Greyhound Mystique
i think that kind of activity is within tolerances. i've had to throttle crawls in the past where async threads were hitting the servers at xx calls per second. i learned that taking a slower run at these public services tended to keep you under the radar. you'll only know, tbh, once you've had it up and running day in, day out for a week or so. normally any flags get raised pretty quickly though, so if you're in the clear just now, i don't see anything changing.
[edit] - just looked. i presume that's point-by-point, tennis-power and statistics at the lower levels??
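Throttling a crawl like that comes down to enforcing a minimum gap between requests. A minimal sketch (the `fetch` function is a placeholder standing in for a real `requests.get` call, not part of the original post):

```python
import time

def throttled(min_interval):
    """Decorator: enforce a minimum delay between successive calls,
    capping the crawl at roughly 1/min_interval requests per second."""
    def wrap(fn):
        last = [0.0]  # time of the previous call
        def inner(*args, **kwargs):
            wait = min_interval - (time.monotonic() - last[0])
            if wait > 0:
                time.sleep(wait)
            last[0] = time.monotonic()
            return fn(*args, **kwargs)
        return inner
    return wrap

@throttled(0.2)  # at most ~5 calls per second
def fetch(url):
    # stand-in for requests.get(url); returns the url so the sketch is testable
    return url
```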
Cheers Jim
jimjimibt wrote: ↑Thu Jan 07, 2021 5:05 pm
Yes, only tennis. I already have statistics being grabbed every minute or so with no issues. Points/Games/Sets is going to need a faster loop, though, or it will be inaccurate. It's a maximum of 20s-25s per break between points, so it needs to run a bit more often.
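A score-polling loop along those lines just needs its interval comfortably under the shortest 20s-25s break between points. A rough sketch (the `fetch`/`handle` callables are placeholders, not any real scores API):

```python
import time

def poll_scores(fetch, handle, stop, interval=10):
    """Call fetch() every `interval` seconds and pass the result to handle(),
    until stop() returns True. A 10s interval guarantees at least two reads
    inside every 20s-25s break between points."""
    while not stop():
        handle(fetch())
        time.sleep(interval)
```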
i'd suggest just trying it at 4-5 pings for an extended period every day and see what gives. worst case scenario, you get blocked after 3-4 days; you could then change IP or use a switching proxy if that turns out to be the case.
sniffer66 wrote: ↑Thu Jan 07, 2021 5:17 pm
basically, if it gets chopped but it's giving you the data you need and your strategy is profitable, then a switching proxy at $20 a month is a no-brainer. keep us posted.
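A switching proxy can be as simple as cycling through a pool on each request. A sketch (the proxy URLs are made-up placeholders; in practice you'd pass the returned dict to `requests.get(url, proxies=...)`):

```python
import itertools

def proxy_rotator(pool):
    """Return a callable that yields a requests-style proxies dict,
    cycling through the pool so successive requests use different exits."""
    cycle = itertools.cycle(pool)
    def next_proxies():
        proxy = next(cycle)
        return {"http": proxy, "https": proxy}
    return next_proxies

# e.g. next_proxies = proxy_rotator(["http://p1.example:8080", "http://p2.example:8080"])
# res = requests.get(url, proxies=next_proxies(), timeout=10)
```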
Hi Archery, loving this thread. Very well done.
Having read through the various posts, can you confirm, in terms of the initial analysis, that you have moved away from your original idea of averaging the last and best times to one where you use an adjusted last time? If so, was there any obvious rationale behind that?
Also, I know that you have automated much of the analysis work. Which site is the source of your data extract, and do you make the adjustments for grade changes etc. as part of the automation, or do you do that manually?
Thanks in advance and keep up the good work.
Conners
-
- Posts: 3219
- Joined: Thu Oct 24, 2019 8:25 am
- Location: Newport
Hi,
conners wrote: ↑Thu Jan 07, 2021 10:39 pm
I don't personally take an average of the last 5 runs anymore, just the last time they ran, adjusted if they're stepping up or down. My unscientific reasoning is that the last 5 runs could be spread over a number of weeks or even months; I would rather know what they achieved last time out.
Now I have a database which is automatically updated daily from RP.
Finally, I don't look for a time difference >= 0.10s between the fastest and next fastest dog in the race anymore. I monitor the top 2 and then try to get matched on the one with the biggest % increase over 10 seconds, always looking to get a back matched at current lay_price - 1.
Cheers,
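One reading of the "biggest % increase over 10 seconds" pick can be sketched as below. The snapshot values are illustrative placeholders; whether the tracked quantity is price or matched money is the author's call, and the real readings would come from the exchange feed:

```python
def biggest_pct_increase(readings):
    """readings: {runner: (value_10s_ago, value_now)} for the top-2 dogs.
    Returns the runner whose tracked value rose most in % terms."""
    def pct(old, new):
        return (new - old) / old * 100.0
    return max(readings, key=lambda r: pct(*readings[r]))
```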
That makes sense, Archery. Thanks for confirming.
I would actually say that the current version of the strategy better reflects how a typical betting shop punter would decide who to put their money on. They will typically only have access to the printed racing pages that appear in the betting shop, so will devise their system/strategy using only that information.
Have you found certain times of day to be better than others ?
Either way, I hope that your prices continue to be profitable for you.
Conners
Yes,
conners wrote: ↑Fri Jan 08, 2021 11:42 am
Morning and afternoon racing is sh!te from a liquidity point of view unless they are A1 to A6 races. The 'OR' races are usually good to trade at any point in the day, but for other reasons I don't normally trade them.
Also, I now monitor all races from 10 mins out, as the market-making bots are being extra sneaky bastards lately. Probably because I keep posting.
Cheers,
Is anyone else able to use this? The results no longer appear for me!
spreadbetting wrote: ↑Sat Jan 02, 2021 6:42 pmI did a little tinkering today and removed the extract_times routine, so it now simply checks the previous history to pick out the fastest time over the race distance instead of relying on Sporting Life, as they seemed to be a bit hit and miss. I'm sure the code will fall over at some point, as it's only been run a couple of times today and is bound to come across key errors or times where expected data isn't present. But I'll post it up as it may be of use to some, and it can be amended to average the last 5 races, or races at the course etc.
ArticalBadboy wrote: ↑Tue Dec 29, 2020 2:25 pmThanks for the reply SB
I think I understand. I mistakenly thought 'extract_times' was a function that extracted both Best and Last from gh-racing-runner-greyhound-sub-info.
Thank you
Code:
import json
import requests
from bs4 import BeautifulSoup
from requests_html import HTMLSession

def main():
    session = HTMLSession()
    baseUrl = "https://www.sportinglife.com"
    results = []
    res = requests.get("https://www.sportinglife.com/greyhounds/racecards")
    soup = BeautifulSoup(res.text, "html.parser")
    summary = list(filter(None, [link.find_parent('a', class_='') for link in soup.find_all('span', class_='gh-meeting-race-time')]))
    for tag in summary:
        link = tag.get('href')
        res = session.get(baseUrl + link)
        print(baseUrl + link)
        soup = BeautifulSoup(res.text, "html.parser")
        race = soup.find('h1').get_text()
        grade = soup.find(class_='gh-racecard-summary-race-class gh-racecard-summary-always-open').get_text()
        distance = soup.find(class_='gh-racecard-summary-race-distance gh-racecard-summary-always-open').get_text().strip("m")
        Runners = dict()
        data = json.loads(soup.find('script', type='application/json').string)  # convert to dictionary
        for runner in data['props']['pageProps']['race']['runs']:
            Trap = runner['cloth_number']
            Name = runner['greyhound']['name']
            fastest_time = 100
            for time in runner['greyhound']['previous_results']:
                if time['distance'] == distance and time['run_time'] != "":
                    if float(time['run_time'].strip("s")) <= fastest_time:
                        fastest_time = float(time['run_time'].strip("s"))
            Runners[fastest_time] = str(Trap) + '. ' + Name
        if Runners and ('OR' in grade or 'A' in grade):
            x = sorted((k, v) for k, v in Runners.items())
            if (x[1][0] - x[0][0]) >= 0.1:
                timeDiff = round((x[1][0] - x[0][0]), 2)
                results.append(race + ', ' + x[0][1] + ', class ' + grade + ', time difference ' + str(timeDiff))
    results.sort()
    with open('dogs_1.txt', mode='w') as file:
        for line in results:
            file.write(line + '\n')

main()
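As noted above, the fastest-time logic can be amended to average the last 5 races instead. A sketch of that amendment, assuming `previous_results` is ordered newest-first as in the script above:

```python
def avg_last_5(previous_results, distance):
    """Average run time (seconds) of the last 5 valid results at this
    distance; returns None if the dog has no valid times at the trip."""
    times = [float(r['run_time'].strip('s'))
             for r in previous_results
             if r['distance'] == distance and r['run_time'] != ""]
    last5 = times[:5]  # assumes newest-first ordering
    return round(sum(last5) / len(last5), 2) if last5 else None
```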
-
- Posts: 3140
- Joined: Sun Jan 31, 2010 8:06 pm
Sporting Life changed their HTML code, so I had to tweak it again; it now just runs from the JSON data rather than the HTML.
Code:
import json
import requests
from bs4 import BeautifulSoup
from requests_html import HTMLSession

def main():
    session = HTMLSession()
    baseUrl = "https://www.sportinglife.com"
    results = []
    urls = []
    res = requests.get("https://www.sportinglife.com/greyhounds/racecards")
    soup = BeautifulSoup(res.text, "html.parser")
    data = json.loads(soup.find('script', type='application/json').string)  # convert to dictionary
    for market in data['props']['pageProps']['meetings']:
        for race in market['races']:
            if race['race_stage'] == "Dormant":
                urls.append('/greyhounds/racecards/' + race['date'] + '/' + race['course_name'].replace(" ", "-") + '/racecard/' + str(race['race_summary_reference']['id']))
    for link in urls:
        res = session.get(baseUrl + link)
        print(baseUrl + link)
        soup = BeautifulSoup(res.text, "html.parser")
        data = json.loads(soup.find('script', type='application/json').string)  # convert to dictionary
        grade = data['props']['pageProps']['race']['race_summary']['race_class']
        distance = data['props']['pageProps']['race']['race_summary']['distance']
        course = data['props']['pageProps']['race']['race_summary']['course_name']
        race = data['props']['pageProps']['race']['race_summary']['date'] + "," + data['props']['pageProps']['race']['race_summary']['time'] + " " + course
        Runners = dict()
        for runner in data['props']['pageProps']['race']['runs']:
            Trap = runner['cloth_number']
            Name = runner['greyhound']['name']
            fastest_time = 100
            for time in runner['greyhound']['previous_results']:
                if time['distance'] == distance and time['run_time'] != "":
                    if float(time['run_time'].strip("s")) <= fastest_time:
                        fastest_time = float(time['run_time'].strip("s"))
            Runners[fastest_time] = str(Trap) + '. ' + Name
        if Runners and ('OR' in grade or 'A' in grade) and len(Runners) > 1:
            x = sorted((k, v) for k, v in Runners.items())
            if (x[1][0] - x[0][0]) >= 0.1:
                timeDiff = round((x[1][0] - x[0][0]), 2)
                results.append(race + ', ' + x[0][1] + ', class ' + grade + ', time difference ' + str(timeDiff))
    results.sort()
    with open('dogs_1.txt', mode='w') as file:
        for line in results:
            file.write(line + '\n')

main()
Last edited by spreadbetting on Sun Jan 10, 2021 12:27 pm, edited 1 time in total.
The code has stopped working again. Yesterday it was fine, but today it is not.
spreadbetting wrote: ↑Sat Jan 09, 2021 1:53 pm
You will probably find that someone from sportinglife is monitoring this forum; every time an update is posted, they change the site structure.
murdok wrote: ↑Sun Jan 10, 2021 10:25 am
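Since the breakages are mostly key errors when the page's JSON structure shifts, a defensive nested lookup lets the script skip a race it can't parse instead of crashing outright. A small helper sketch (the example path mirrors the keys used in the script above):

```python
def safe_get(data, *keys, default=None):
    """Walk nested dicts; return `default` instead of raising KeyError
    when the site's JSON structure changes underneath the scraper."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data

# e.g. runs = safe_get(data, 'props', 'pageProps', 'race', 'runs', default=[])
```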