Over my summer break I am taking baby steps in Python - using PyCharm Community 2020.3
I thought I would start out looking at my last three months of Greyhound Data, so I downloaded that in csv format from my Betfair P&L page.
Those familiar will know that the first column of the csv data is very messy and so I have created a short program to clean it up and prepare a revised excel spreadsheet.
In order to run the program you will need PyCharm Community 2020.3, you will need to have pandas and openpyxl installed and have excel.
After downloading my betfair data I named it "hounds.csv" and put it in the same folder as my PyCharm project. You will see in the red text below you will need to insert your own username into the pathway.
I plan to create a program to analyse this data.
Hope this is helfpul.
/////
import pandas as pd
df = pd.read_csv('hounds.csv')
df.dropna(inplace = True)
df.drop(columns=['Start time', 'Settled date'])
header_list = ['Market', 'Track', 'Date', 'Race', 'Length', 'Profit/Loss (AUD)']
df = df.reindex(columns=header_list)
new = df["Market"].str.split("/", n = 1, expand = True)
df['Market']=new[0]
newc=new[1].str.split(" ", n=2, expand=True)
df['Track']=newc[1]
newc=newc[2].str.split(":", expand=True)
newc['Date']=newc[0].str[-9:]
df['Date']=newc['Date']
newc['Race']=newc[1].str[:3]
df['Race']=newc['Race']
newc['Length']=newc[1].str[4:9]
df['Length']=newc['Length']
df.to_excel (r'C:\Users\XXXX\Desktop\export_dataframe.xlsx', index = False, header=True)
Python CSV Clean Up for Greyhounds
-
- Posts: 3140
- Joined: Sun Jan 31, 2010 8:06 pm
Same here, Dallas. I use VBA to parse the data files but always good to learn something new. I started learning python about a year ago but got knocked off track when covid shut everything down. Hopefully get back into coding in the new yer.
Had to really fiddle with my pycharm interpreter
and working directory (which helped a lot with understanding) to get this to work. Been whiling away my hours with some python learning but chemo and old brain makes for hard yakka
Thanks for this gazuty. Look forward to using pandas+python more so I can at least follow what you did


Thanks for this gazuty. Look forward to using pandas+python more so I can at least follow what you did

Well done McGoo. Hope you beat that bastard cancer. One day we will die, but every day before then we will live. Carpe diem.mcgoo wrote: ↑Sat Jan 02, 2021 1:05 amHad to really fiddle with my pycharm interpreterand working directory (which helped a lot with understanding) to get this to work. Been whiling away my hours with some python learning but chemo and old brain makes for hard yakka
![]()
Thanks for this gazuty. Look forward to using pandas+python more so I can at least follow what you did![]()
Gazuty
This is a great little course (dont pay more than 12 GBP for it though
Regards
Peter
https://www.udemy.com/course/python-for-excel-users/
This is a great little course (dont pay more than 12 GBP for it though
Regards
Peter
https://www.udemy.com/course/python-for-excel-users/
-
- Posts: 12
- Joined: Sat Sep 21, 2024 1:06 pm
Hey did you have any luck setting up a greyhounds CSV? I'm currently trying to create one and would love to ask a couple of questions if so!
- MemphisFlash
- Posts: 2329
- Joined: Fri May 16, 2014 10:12 pm
how would you add code so that Start Time is included in the clean up?
The lineMemphisFlash wrote: ↑Fri Mar 21, 2025 1:52 pmhow would you add code so that Start Time is included in the clean up?
Is dropping those columns so basically you need to leave them in and decide what format you want it in or if it's split into date and time.df.drop(columns=['Start time', 'Settled date'])
To be honest I'd have thought you'd be better using VBA to process the CSV files as most of your stuff is done in excel so setting up a function would be more useful to convert and stick on your sheet for analysis.
Some VBA, probably needs a little tweaking
Code: Select all
Sub ProcessBettingPandL()
Dim ws As Worksheet
Dim filePath As String
Dim fileNum As Integer
Dim lineData As String
Dim dataArray As Variant
Dim outputRow As Integer
Dim splitMarket As Variant
Dim splitDetails As Variant
Dim eventDetails As String
Dim splitEvent As Variant
Dim i As Integer
' Use the active sheet and clear previous data
Set ws = ActiveSheet
ws.Cells.Clear
' Prompt user to select the file
With Application.FileDialog(msoFileDialogFilePicker)
.Title = "Select BettingPandL.csv"
.Filters.Clear
.Filters.Add "CSV Files", "*.csv", 1
.AllowMultiSelect = False
If .Show <> -1 Then Exit Sub
filePath = .SelectedItems(1)
End With
' Open the file
fileNum = FreeFile
Open filePath For Input As #fileNum
' Write headers
ws.Cells(1, 1).Value = "Sport"
ws.Cells(1, 2).Value = "Track"
ws.Cells(1, 3).Value = "Grade"
ws.Cells(1, 4).Value = "Distance"
ws.Cells(1, 5).Value = "Date"
ws.Cells(1, 6).Value = "Time"
ws.Cells(1, 7).Value = "Profit/Loss (£)"
outputRow = 2
' Read and process file line by line
Line Input #fileNum, lineData ' Skip the header row
Do While Not EOF(fileNum)
Line Input #fileNum, lineData
dataArray = Split(lineData, ",") ' Split CSV row into columns
' Process "Market" column
splitMarket = Split(dataArray(0), " / ") ' Split by " / "
' Extract event details (everything after the slash)
eventDetails = Trim(splitMarket(1))
' Split the event details by " : " to separate the date + race info
splitDetails = Split(eventDetails, " : ")
' Ensure there is data after " : "
If UBound(splitDetails) < 1 Then GoTo NextRow
' Extract course (track) name from the portion **before the date**
splitEvent = Split(splitDetails(0), " ") ' Split by space
' Track name is everything **before the last two words (date)**
Dim trackName As String
Dim numParts As Integer
numParts = UBound(splitEvent)
' Reconstruct track name (everything before the last two elements)
trackName = ""
If numParts > 1 Then
Dim j As Integer
For j = 0 To numParts - 2 ' Exclude last two (date)
trackName = trackName & splitEvent(j) & " "
Next j
trackName = Trim(trackName) ' Remove trailing space
Else
trackName = splitEvent(0) ' If only one word, assume it's the track
End If
' Extract the race details (Grade + Distance)
Dim raceDetails As String
raceDetails = Trim(splitDetails(1)) ' FIX: Use it as is
' Handle "To Be Placed"
If raceDetails = "To Be Placed" Then
ws.Cells(outputRow, 1).Value = splitMarket(0) ' Sport
ws.Cells(outputRow, 2).Value = trackName ' Track
ws.Cells(outputRow, 3).Value = "" ' No grade
ws.Cells(outputRow, 4).Value = "To Be Placed" ' Assign "To Be Placed" to Distance
Else
' Split into Grade and Distance
splitEvent = Split(raceDetails, " ")
Dim grade As String, distance As String
grade = ""
distance = ""
' Ensure we have enough elements before assigning values
If UBound(splitEvent) >= 1 Then
grade = splitEvent(0) ' First part is always Grade
distance = splitEvent(1) ' Second part is always Distance
ElseIf UBound(splitEvent) = 0 Then
' If there's only one part, it's assumed to be the Grade
grade = splitEvent(0)
distance = ""
End If
' Assign to correct columns
ws.Cells(outputRow, 1).Value = splitMarket(0) ' Sport
ws.Cells(outputRow, 2).Value = trackName ' Track
ws.Cells(outputRow, 3).Value = grade ' Grade
ws.Cells(outputRow, 4).Value = distance ' Distance
End If
' Split "Start Time" into Date and Time
Dim startDateTime As String
startDateTime = Trim(dataArray(1)) ' Get the start time column
Dim splitDateTime As Variant
splitDateTime = Split(startDateTime, " ") ' Split by space
If UBound(splitDateTime) = 1 Then
ws.Cells(outputRow, 5).Value = splitDateTime(0) ' Date
ws.Cells(outputRow, 6).Value = splitDateTime(1) ' Time
Else
ws.Cells(outputRow, 5).Value = startDateTime ' If no space, keep as Date
ws.Cells(outputRow, 6).Value = "" ' Leave Time blank
End If
' Copy "Profit/Loss (£)" column (Skip Settled Date)
ws.Cells(outputRow, 7).Value = dataArray(3) ' Profit/Loss
NextRow:
outputRow = outputRow + 1
Loop
' Close the file
Close #fileNum
End Sub
- MemphisFlash
- Posts: 2329
- Joined: Fri May 16, 2014 10:12 pm
thank you for your reply, very informative.