I remember when they set this up.
One rule was that the stack sizes had to reset back to the original amounts. This favoured the AI programme, and that's how it ended up.
The bluffing aspect could be handled with a randomiser. Take "I call 50% of the time with a nut flush draw": a 50/50 randomiser then makes the decision. The same goes for other scenarios with a bet or overbet.
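A rough sketch of that randomiser idea in Python, just to show how little code it takes. The function name `mixed_decision` and the call/fold labels are my own placeholders, not taken from any real poker bot.

```python
import random

def mixed_decision(call_frequency, rng=random):
    """Return "call" with probability call_frequency, otherwise "fold".

    call_frequency is a number between 0 and 1, e.g. 0.5 for
    "call 50% of the time with a nut flush draw".
    """
    if not 0.0 <= call_frequency <= 1.0:
        raise ValueError("call_frequency must be between 0 and 1")
    # rng.random() is uniform on [0, 1), so this is true with
    # probability call_frequency.
    return "call" if rng.random() < call_frequency else "fold"

# Usage: a seeded generator makes the run repeatable.
rng = random.Random(42)
sample = [mixed_decision(0.5, rng) for _ in range(10_000)]
call_share = sample.count("call") / len(sample)  # should land close to 0.5
```

The same function covers a bet or overbet spot by swapping the labels and the frequency.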
The hand charts that a poker player studies and learns, which represent the best GTO play, can be programmed in too.
Exploitative settings could also be added for when an opponent falls into a certain way of playing, such as not playing enough hands.
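Something like the sketch below is what I have in mind for a chart plus an exploitative tweak. Every number here is made up for illustration: the four-hand chart, the 15% "too tight" threshold and the 0.25 bonus are assumptions, not real GTO values, and the names are hypothetical.

```python
# Hypothetical baseline chart: hand -> how often to open-raise it.
# Real charts cover every starting hand per position; these values are invented.
BASELINE_OPEN_FREQ = {"AA": 1.0, "AKs": 1.0, "76s": 0.5, "72o": 0.0}

def adjusted_open_freq(hand, opponent_vpip, tight_threshold=0.15, bonus=0.25):
    """Look up the baseline frequency, then widen it against a tight opponent.

    opponent_vpip is the share of hands the opponent voluntarily plays (0 to 1).
    The threshold and bonus are illustrative assumptions.
    """
    base = BASELINE_OPEN_FREQ.get(hand, 0.0)  # unknown hands default to a fold
    # Only widen hands the chart already plays sometimes.
    if opponent_vpip < tight_threshold and base > 0.0:
        return min(1.0, base + bonus)
    return base

# Usage: against someone playing only 10% of hands, 76s gets opened more often.
freq_vs_tight = adjusted_open_freq("76s", opponent_vpip=0.10)
freq_vs_normal = adjusted_open_freq("76s", opponent_vpip=0.30)
```

The returned frequency could then be fed into a randomiser to pick the actual play.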
With the way AI can learn from patterns and trends, I reckon one exists out there in some form and is being used online. Not the ones that are attached to a piece of software, but external ones where the player can key in data for each optimal play.