Football Scores, In-Play Stats & Momentum via script, using an API. Version 2

sniffer66
Posts: 1666
Joined: Thu May 02, 2019 8:37 am

As promised in the old thread, I've made some major changes to the code I posted to pull scores, stats and momentum data for matches from the SofaScore API.

Old thread here:

viewtopic.php?f=50&t=24908

Note: The code got corrupted when I pasted it into code tags previously, so I have added it to the Google Drive link listed in the Version 1 thread above. It's called "SS_Fuzzy_V1.0.au3". Just copy that to your C:\Temp folder, then open and run it in the SciTE editor.

There were a few issues with the old code, primarily the pattern matching between the BF market names and the ones used by SofaScore. In many cases the team names used are completely different, site by site. The download code was also indiscriminate, in that data for all live matches was being pulled on each loop - not very efficient.

So, I hit on the idea of creating an input\output loop in Guardian via the Import & Export CSV Rules, using only markets where the baf was deployed. The code then knows which markets to query for on SofaScore plus it allows for much more complex pattern matching in the code.

I'm using a 3 part pattern match now.

1. Two strings match exactly in both match names, ignoring commonly used words
2. An algo using the Damerau–Levenshtein distance to return the edit distance between the 2 strings, i.e. how many single-character edits are needed to match them
(https://en.wikipedia.org/wiki/Damerau-L ... n_distance for any bored enough to check :D )
3. A simple check on whether we have an age-group match, i.e. U19, U23 etc., or a women's match. There are often matches played at the same time\day where the only difference is the "U19" or "(W)" in the market name

Where 2 matches both pass the pattern match, I then rank them on the shortest Damerau–Levenshtein distance to determine which is the correct one.
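
For anyone who wants to see the idea without opening the script, here is a minimal Python sketch of parts 2 and 3 plus the ranking. It is not the AutoIt code from the download. It uses the optimal string alignment variant of Damerau–Levenshtein (the restricted form, which is simpler to write), and it assumes whole match names are compared and that age and women's tags look like "U19" or "(W)". The function names are made up for the example.

```python
import re


def osa_distance(a, b):
    """Edit distance counting insertions, deletions, substitutions and
    swaps of two adjacent characters (optimal string alignment variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # adjacent transposition, e.g. "ca" -> "ac"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


# Assumed tag formats: "U19", "U23", "(W)". Adjust to the real market names.
TAG_PATTERN = re.compile(r"\bU\d{2}\b|\(W\)", re.IGNORECASE)


def tags(name):
    """Set of age-group / women's tags found in a match name."""
    return {t.upper() for t in TAG_PATTERN.findall(name)}


def best_candidate(bf_name, ss_names):
    """Keep only candidates with the same tags, then pick the closest name."""
    same_tags = [n for n in ss_names if tags(n) == tags(bf_name)]
    if not same_tags:
        return None
    return min(same_tags, key=lambda n: osa_distance(bf_name.lower(), n.lower()))
```

Returning None when nothing shares the same tags follows the "no score is better than the wrong score" approach: a senior fixture is never matched to its U19 namesake just because the names are close.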

A step through then looks like this:

1. Deploy baf to required Match Odds markets in Guardian using a coupon\filter etc
2. Run script in Scite (Same install instructions as Version 1)
3. Baf creates a file in C:\Temp\CSV_Output with the market name appended, when an in-play state is detected
4. Script picks up the exported file name, deletes the file and checks whether SS lists a pattern-matched market that isn't already in the query list. If so, it looks up the Event ID and adds it to an array that is queried on each loop until the match ends
5. Script queries the API for each match with an EventID, and returns the score, stats and momentum data to the import CSV for Guardian to import
6. At match end (status set in the API), the market is deleted from our query list automatically
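
Steps 3, 4 and 6 boil down to a small file handshake, which can be sketched in Python as below. This is an illustration only, not the AutoIt script: the ".csv" extension, the optional file name prefix, the dictionary used as the query list and the is_finished callback are all assumptions, since the thread doesn't show the exact file naming or the API status values.

```python
from pathlib import Path


def collect_new_markets(export_dir, watched, prefix=""):
    """Pick up exported marker files, delete them and return the market
    names that are not yet in the query list.

    watched maps market name -> event ID (None until it has been looked up).
    prefix is whatever fixed text precedes the market name in the file name.
    """
    new_markets = []
    for path in sorted(Path(export_dir).glob("*.csv")):
        stem = path.stem
        market = stem[len(prefix):] if stem.startswith(prefix) else stem
        path.unlink()  # the file is only a signal, so remove it once read
        if market not in watched:
            watched[market] = None
            new_markets.append(market)
    return new_markets


def prune_finished(watched, is_finished):
    """Drop every market whose event the caller reports as finished."""
    for market, event_id in list(watched.items()):
        if event_id is not None and is_finished(event_id):
            del watched[market]
```

Each pass of the main loop would call collect_new_markets, try to resolve an event ID for anything new, query the matches that have one, and finish with prune_finished.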

Notes:

You should be able to leave the script running 24/7 as it is self-maintaining. I left it on all weekend with no crashes. Just load and apply your baf to the day's markets daily etc.

Scores, stats and Momentum are pulled for each match automatically. Where SS are only publishing the score or a cut down set of data then only those will be created in the import CSV. Momentum and stats tend to only be available for major leagues

I've changed the default value of all the SV's to 0, so you may want to cater for those in your automation, i.e. Is Not 0

SV's returned can be checked in the exported CSV file but I've added in a few more for the Momentum Graph
Note: Graph is on a 0 axis so 0 to +99 is in favour of Home, 0 to -99 in favour of Away


5MinAvgPressure (Average of last 5 mins)
10MinAvgPressure (Average of last 10 mins)
MatchAvgPressure (Average of Entire match)
Momentum1 (Current Momentum, 1m interval)
Momentum2 (Current Momentum -1, 1m interval)
Momentum3 etc
Momentum4
Momentum5
Time (match minute; stays at the default 0 until momentum data is available)
5MinPeakResult (Highest Momentum value in last 5m)
10MinPeakResult (Highest Momentum value in last 10m)
5MinLowResult (Lowest Momentum value in last 5m)
10MinLowResult (Lowest Momentum value in last 10m)

1HHighResult1 (at 45m, this returns the highest + peak in the 1st half)
1HHighResult2 (at 45m, this returns the 2nd highest + peak in the 1st half)
1HLowResult1 (at 45m, this returns the lowest - peak in the 1st half)
1HLowResult2 (at 45m, this returns the 2nd lowest - peak in the 1st half)

I put the last 4 in so the automation would have a handle on decent first half pressure for each team, ready for any 2nd half entries, without having to refer to individual stats
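
To make the rolling values concrete, here is a rough Python sketch of how they could be derived from the graph datapoints. It assumes one momentum value per minute, oldest first, and is not taken from the AutoIt script. The first half High/Low results are left out because they depend on how a "peak" is detected.

```python
def momentum_summary(points):
    """Build the rolling momentum SVs from per-minute graph values.

    points: one momentum value per minute, oldest first
            (positive favours Home, negative favours Away).
    """
    if not points:
        return {}  # no graph published, so the SVs keep their default of 0
    last5, last10 = points[-5:], points[-10:]
    summary = {
        "Time": len(points),  # number of datapoints doubles as the match minute
        "5MinAvgPressure": sum(last5) / len(last5),
        "10MinAvgPressure": sum(last10) / len(last10),
        "MatchAvgPressure": sum(points) / len(points),
        "5MinPeakResult": max(last5),
        "10MinPeakResult": max(last10),
        "5MinLowResult": min(last5),
        "10MinLowResult": min(last10),
    }
    # Momentum1 is the latest value, Momentum2 the minute before, and so on
    for n, value in enumerate(reversed(last5), start=1):
        summary["Momentum%d" % n] = value
    return summary
```

With fewer than 5 or 10 datapoints the averages simply use whatever is available, which is one reasonable choice for the opening minutes of a match.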

Note: The SV for Time is only available where there is a momentum graph, as it uses the number of datapoints for the graph. It is, however, more accurate than using BA in-play time, as the 2nd half always starts on minute 46, irrespective of 1st half ET

IMPORTANT: You need to create the directory C:\Temp\CSV_Output manually, prior to using the baf\script for the first time

I've included an experimental 2nd Half LTD baf to demo the SV's created. No idea how this one plays out in real life. It does, however, populate the watch list with the time, score and key stats (if available) via alert rules, so you can see at a glance what's happening in the game.
If you only want the stats data, then just delete everything but the first 4 lines in the baf, or copy those to your own.

[Attachment: Capture.JPG]

I ran this all weekend and the pattern matching checked out on every match I looked at, though there were so many games I can't guarantee every one was 100% - too much work for one person lol

Anyway, hope this is useful. I've enjoyed the challenge of creating it and learnt quite a bit along the way

Cheers

Stu
Last edited by sniffer66 on Tue Mar 22, 2022 10:21 am, edited 1 time in total.
sniffer66
Posts: 1666
Joined: Thu May 02, 2019 8:37 am

Thanks for getting back to me, good to know that I'm following and understanding the code somewhat successfully.

For Brest vs Angers, Brest are "Stade Brestois 29" on SofaScore, so it probably only matched Angers. For Cadiz vs Villarreal, Cadiz are "Cádiz" so it probably was a special character issue. I also had a Belgian game yesterday where despite both teams having about half a dozen words in their names, none of them matched Betfair (Sint-Truidense VV vs K. Beerschot V.A. on SofaScore, but I don't have the BF names to hand). I'm wondering if it would be possible to maintain a lookup list that the script could try first before moving onto the pattern matching? I'd be happy to contribute to assembling and maintaining that if it's possible / useful? The reason I suggest it is that I think you'll get a lot of cases that can't be picked up, especially in the Premier League. For example, Man City and Man Utd would both be missed by the algorithm because they are "Manchester" on Betfair, and almost every other team in the league only has one word in their name. So unless they were specifically playing Aston Villa or Crystal Palace you would never get the pattern match to work (there's also West Ham, but I think "Ham" would be ignored, is that right?).

I'm also wondering what happens if some of the strings are the same? I trade the Japanese J league and there are a couple of "Tokyo" teams. So what would happen if they were playing each other? Would each "Tokyo" register separately, or would it only recognise one string as matching?

It would be good to find a way to solve this. I fully agree that it's better to have no score than the wrong score when using automation, but sometimes you need to put bets on pre-kickoff and it would be frustrating to do that and then find that you don't have stats for the game.

Finally, a question about the .baf. How many times / how often does the "Export Data from CSV" rule need to run? I believe in your original file you had it set to rearm every 5 mins (300 seconds).

EDIT: Just thinking about the pattern matching a bit more. Could it be set to allow one "generic" word to register as a match, but not more than one?


Thanks!
JT
sniffer66
Posts: 1666
Joined: Thu May 02, 2019 8:37 am

Hi JT

Copied your questions into the new thread...

All good points, and it's this kind of collaboration that helps iron out any niggles in code like this. You don't always see issues\different approaches when working on your own.

1. I did consider using a lookup table for certain team names, but thought it would take a lot of maintenance\work. You'd need to find how each team is listed on both SS and BF and have an entry for both

2. Common words like City\Hapoel\Tokyo etc. are ignored. This is set in the "NameCheck" function, starting on line 292. It's easy enough to add additional words like "Tokyo" as they are discovered: just copy the 2 lines for each word and append them to the end so a "No" is returned. My call was to completely disregard any common words and only match on unique ones

3. The import csv runs every 300s or until the Time <> 0 (only available when momentum data is there). I tried just running it once, at match start, but sometimes SS lag in adding the match to the API, so it has to repeat. I need to come up with a way to stop that export using the only common stat - "Score". However, if the score stays at 0-0 nothing has changed. Maybe I can pass back a "Stop Exporting" SV = 1 when we have a unique ID in the API for a match - that would do it :)

4. I'm sure the code can be improved for the pattern matching. It may be that I am doing the steps in the wrong order. Possibly running the fuzzy matching algo first to determine the matches that have the closest word fit, then doing the string count matching would work better. I'll have a play around when I have some time.
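
As a talking point for 1 and 2, here is a small Python sketch of a "lookup first, then word match" approach, with accents folded so "Cádiz" and "Cadiz" compare equal. It is not the NameCheck function from the script: the alias entry, the ignore list (including the "v"/"vs" separators) and the function names are all illustrative assumptions.

```python
import unicodedata

# Illustrative only: the real ignore list lives in the script's NameCheck function
COMMON_WORDS = {"city", "hapoel", "tokyo", "v", "vs"}

# Hypothetical lookup entries mapping a SofaScore name to the Betfair name
ALIASES = {"stade brestois 29": "brest"}


def fold(text):
    """Lower-case the text and strip accents, so 'Cádiz' becomes 'cadiz'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def unique_words(match_name):
    """Words of a match name after alias lookup, minus the common words."""
    name = fold(match_name)
    for alias, canonical in ALIASES.items():
        name = name.replace(alias, canonical)  # lookup table is tried first
    words = name.replace("-", " ").replace(".", " ").split()
    return {w for w in words if w not in COMMON_WORDS}


def shared_word_count(bf_name, ss_name):
    """How many non-common words the two match names have in common."""
    return len(unique_words(bf_name) & unique_words(ss_name))
```

A count of 2 or more would satisfy the first part of the pattern match. Because common words are dropped before counting, two "Tokyo" sides playing each other would only match on their other words.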
DrJAT
Posts: 48
Joined: Wed Feb 16, 2022 3:20 pm

sniffer66 wrote:
Tue Mar 22, 2022 10:14 am
1. I did consider using a lookup table for certain team names, but thought it would take a lot of maintenance\work. You'd need to find how each team is listed on both SS and BF and have an entry for both
I'd be very happy to help with this. I don't think it would take too long to cover the popular European leagues and I'd be very happy to contribute. I trade a bunch of leagues that are in unfriendly timezones for the UK so this would be a real game-changer for me. I was already planning on doing something similar anyway for a tool I'm hoping to build (once I learn the database skills I need), so this would just be bringing a task further up my to-do list.
Frogmella
Posts: 220
Joined: Mon May 30, 2011 2:44 pm
Location: Towcester

If you delete the first Version 2 thread and just leave the original one, readers won't have a link to the latest .baf file. May I suggest you edit the first post of this new thread and include a link to it.
asaele
Posts: 22
Joined: Fri Jul 02, 2021 8:45 am

Just noticed this upgrade, and very keen to try it!
Thanks a lot for making this, really useful.

Seems like it is running fine, but the "Sofascore_Final" file ends up containing just a single "1".
Only one match running though, Macarthur vs Melbourne. But that match is picked up perfectly by version 1.

Sounds like I am just missing something very basic here.

EDIT: Worked now on some other matches. No idea why this particular match was not picked up though....
sniffer66
Posts: 1666
Joined: Thu May 02, 2019 8:37 am

No problem, glad it's useful.

Not sure why there was an issue on that match. Was there a big difference in the team naming between the two sites? That's the usual issue.
It's a tough one: make it too loose and you get false positive matches, too tight and you don't get many\any.
Would be so much easier if all sites used a standard name for each team :(
asaele
Posts: 22
Joined: Fri Jul 02, 2021 8:45 am

Feel a bit bad asking for more here, as this is just so great... but would red cards be possible to write to file as well?
Think that is the single piece missing here to have almost complete info on the game state.

The added momentum numbers per minute are really useful btw
sniffer66
Posts: 1666
Joined: Thu May 02, 2019 8:37 am

It's not a problem, and I'm sure I had cards in at some point. I'm guessing you want 2 values, reds home and reds away?

And glad you are finding those momentum values useful. I'm using them in my automation as well. Currently doing some analysis on build up values prior to a goal...
asaele
Posts: 22
Joined: Fri Jul 02, 2021 8:45 am

Great! Reds home and reds away would be perfect.

Yes, the specific momentum numbers per minute are much more flexible than just the 5 and 10 minute averages.

My angle here is to combine the various indicators your script supplies with a model based "fair value" based on the prematch odds (match odds and goals) and game state (goals, cards, time). The second part is also a bit complicated, but your script was what I needed to get started.
Frogmella
Posts: 220
Joined: Mon May 30, 2011 2:44 pm
Location: Towcester

I cannot make this work.
My logfiles are just full of this
5/04/2022 19:47:43: [G_Auto 3] : Unable to read file: C:\Temp\Sofascore_Final.csv. Check file exists and isn't locked.
Frogmella
Posts: 220
Joined: Mon May 30, 2011 2:44 pm
Location: Towcester

Disregard the above. I have "Shared" the file and it now has some data.
sniffer66
Posts: 1666
Joined: Thu May 02, 2019 8:37 am

Sounds like a permissions issue on the file\folder then. Glad it's all sorted.
Can't edit the OP any more (seems it locks after 24 hours)
DrJAT
Posts: 48
Joined: Wed Feb 16, 2022 3:20 pm

sniffer66 wrote:
Tue Mar 22, 2022 9:59 am
Note: Graph is on a 0 axis so 0 to +99 is in favour of Home, 0 to -99 in favour of Away
Hi Sniffer,

Has SofaScore changed how the momentum stats are scaled? I got a value of 115 last night in the match between Internacional and Avai (Brazil Serie A).
Cheers,
JT
sniffer66
Posts: 1666
Joined: Thu May 02, 2019 8:37 am


Not that I'm aware of JT. But I have seen the odd spike over 100 in the past. Just assumed it was an error in their algo somewhere.
Curious if there was a goal soon after though...

Stu