I started by wanting to know a little more about AI, so I asked GPT-4o "how is AI coded, how does it 'read', 'understand' and know what to do?". Cutting a long answer short, it told me it's all about numbers. Everything, text/graphics/speech, is reduced to numbers and then it looks for patterns. That was encouraging; patterns in numbers is something we are all interested in. Apparently it arrives at all its answers to questions put to it "using math and probabilities to say, 'Based on everything I've seen, this is the most likely correct response.'". So maybe that explains some of the 'wrong' answers we get.
I then wanted to get to the bottom of this 'training' thing. I'd heard lots about it but never delved into what it really means. So I asked "can I personally train an AI bot on my knowledge for my personal use or does anything I 'teach' it go into the main pot of knowledge?". Well, that's when I started to experience technical overload … "Hugging Face", "Retrieval-Augmented Generation (RAG)", "LangChain", "LlamaIndex", "Mistral", "LLaMA", "Ollama" etc. None of them meant anything to me, so that's when I placed the post on the 'main thread'. That's also when I learned you can choose whether to allow it to learn from your questions or not, and when I realised that if I had an 'open' bot using the group logic of the world it could be influenced by the "follow my bets and become a millionaire" YouTubers! So I decided private was the safe way to go.
I then moved on to the required computing power for a private setup. It suggested ...
1. Lightweight Models (Low-end or Mid-range PCs)
These are smaller LLMs like GPT4All, Mistral 7B, or TinyLLaMA, which are good for basic Q&A, note help, or task assistance.
RAM: 8–16 GB (16 GB is comfy)
Drive space: 3–10 GB
CPU: Quad-core (Intel i5/Ryzen 5 or better)
GPU: Not required, but helps (4–6 GB VRAM is enough)
Use case: Summarizing, chatting, helping with notes, answering based on uploaded documents.
Latency: 1–3 seconds per response
2. Mid-Range Models (High-end PCs)
Models like LLaMA 2 13B, Mistral-Medium, or OpenChat clones.
RAM: 32 GB+ recommended
Drive space: 15–25 GB
CPU: High-end (Ryzen 9 or i9)
GPU: 8–12 GB VRAM (NVIDIA 3060/3070+)
Use case: More fluent chat, decent reasoning, document Q&A with context.
Latency: 0.5–2 seconds per response with GPU
3. Heavyweight Models (Serious Workstations/Servers)
You’re looking at models like LLaMA 2 70B, Mixtral, or GPT-J clones.
RAM: 64–128 GB+ (serious server)
Drive space: 40–70+ GB
GPU: 24 GB+ VRAM (RTX 4090+ or A100)
Use case: Advanced reasoning, full document summarization, creative tasks, multiturn conversation with memory.
Latency: Real-time on high-end GPUs; slow on CPU
It then asked me to provide the spec for my PC, but I wanted to keep control of the conversation, so I asked what advantages, if any, there are in having a local installation compared with a VPS, Custom GPT or Cloud, and it produced this table ...
It summarised the options as follows …
Use a Custom GPT in ChatGPT.
Easy, no setup. Just upload docs, set behaviors, and it “talks like you.”
Great if you're okay with cloud-based AI.
Local AI is the way.
You own the model. You can even unplug from the internet and still have your own mini ChatGPT.
Best for developers, researchers, or privacy nerds.
Cloud / VPS hosting (like on AWS, Azure, RunPod, etc.)
Perfect for building bots or apps, especially if you want to:
Share your bot with others,
Run APIs,
Train a custom model once and serve it 24/7.
Some folks run all three:
ChatGPT for quick, clean answers and GPT-4 quality.
Local AI for offline use and confidential stuff.
Cloud/VPS for more advanced tools like a website chatbot or internal app.
OK, now was the time to give it my PC's spec, because if it wasn't suitable for a local install I could forget that option. It advised "your setup is more than powerful enough". Great. It suggested I …
1. Use Ollama for now; it's clean, fast, and optimized.
2. Download Mistral 7B or LLaMA 2 7B (Q4 or Q5 quantized) as it runs directly on CPU and can be integrated with LangChain or Python scripts later.
3. Add a Document Search Layer with LlamaIndex, LangChain, or ChromaDB and,
4. Build a local web interface using Streamlit or Gradio, or a terminal chatbot interface, and integrate with VS Code + Python notebooks for deeper analysis.
Don't ask; as I said, I'm learning, so at the time of drafting this I have no idea what they all mean, but it sounds good!
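For anyone curious what step 4 might look like in practice, here's a minimal sketch (my own, not GPT's exact suggestion): a tiny Gradio page that sends your question to the local Ollama service, which listens on port 11434, and displays the model's reply. It assumes the mistral model from step 2 has already been pulled.

```python
# Minimal local web chat: Gradio front-end talking to Ollama's REST API.
# Assumes Ollama is running locally and "mistral" has been pulled.
# Requires: pip install gradio requests
import requests
import gradio as gr

OLLAMA_URL = "http://localhost:11434/api/generate"

def ask(prompt):
    # Send the question to the local model and return its full reply
    r = requests.post(
        OLLAMA_URL,
        json={"model": "mistral", "prompt": prompt, "stream": False},
        timeout=300,
    )
    r.raise_for_status()
    return r.json()["response"]

# Opens a simple chat page in your browser (default: http://localhost:7860)
gr.Interface(fn=ask, inputs="text", outputs="text", title="My Local Bot").launch()
```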

It then started asking me about my data and its structure. I concentrated on my Horse Racing database, advising it's SQL Server based, and asked if the bot would be able to connect directly … it responded that it can, and in doing so it will be able to run SQL queries from natural language instructions. Brilliant, I struggle with the finer points of SQL.
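To give a flavour of how that could work (this is just a sketch of the idea, not the setup GPT proposed; the Racing database and Runners table are invented for illustration): the model is asked to turn a plain-English question into T-SQL, which is then run over pyodbc.

```python
# Sketch: natural-language question -> T-SQL (via local model) -> results.
# The connection string, database and table names are hypothetical.
import requests
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=Racing;Trusted_Connection=yes;"
)

def nl_to_sql(question, schema_hint):
    # Ask the local model to translate the question into a single SELECT
    prompt = (f"Schema: {schema_hint}\n"
              f"Write one T-SQL SELECT statement answering: {question}\n"
              "Return only the SQL, no commentary.")
    r = requests.post("http://localhost:11434/api/generate",
                      json={"model": "mistral", "prompt": prompt, "stream": False},
                      timeout=300)
    return r.json()["response"].strip()

sql = nl_to_sql("How many winners did each course have in 2023?",
                "Runners(RunnerID, Course, RaceDate, Position)")
print(sql)  # always eyeball model-generated SQL before trusting it
rows = conn.cursor().execute(sql).fetchall()
```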
It kept wanting to 'set this up'; after answering every question it did what in the sales world is called a 'trial close'. I guess that part of it was trained by a double glazing salesman!

Apparently there should be no problem moving your setup at a later date, say if you replace your PC or decide to go VPS/Cloud based.
We then moved on to comparing what to install, which it summarised as follows …
… and it recommended Mistral 7B via Ollama, giving various reasons which I won't bore you with here, but you can ask GPT. It also confirmed I can install more than one model (say llama3) and switch between them using Ollama (and no, I don't really know what that means!).
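From what I can tell, it just means each model is fetched once with the ollama tool (e.g. `ollama pull llama3`), and then switching is nothing more than changing the model name in the request. A hedged sketch:

```python
# Sketch: the same question put to two locally installed models.
# Assumes both "mistral" and "llama3" have been pulled with Ollama.
import requests

def ask(model, prompt):
    r = requests.post("http://localhost:11434/api/generate",
                      json={"model": model, "prompt": prompt, "stream": False},
                      timeout=300)
    return r.json()["response"]

question = "In one sentence, what is a handicap race?"
for model in ("mistral", "llama3"):
    print(f"{model}: {ask(model, question)}")
```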

As the conversation developed, it summarised what it called the "Bot's Job Description" as “Answer statistical questions and produce a table, by pulling structured data from SQL, combining it with external sources (like ratings/tipsters), and outputting a probability-based race assessment.” And that will require …
An SQL query layer (via LangChain or manual function)
A web scraping / API connector (for pulling online ratings/news)
A probability calculator. This could be rule-based logic (e.g. weighting certain stats), simple logistic regression, or eventually even an ensemble model (had to look that last one up!); there's a rough sketch of the rule-based idea below.
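Here's a toy sketch of that rule-based option, with stats and weights invented purely for illustration: each runner gets a weighted-sum score, and the scores are normalised so the race's probabilities add up to 1.

```python
# Toy rule-based win-probability calculator. Stat names and weights
# are made up for illustration, not taken from the actual bot design.
def race_probabilities(runners):
    weights = {"recent_form": 0.5, "course_wins": 0.3, "jockey_strike_rate": 0.2}
    # Weighted-sum score per runner
    scores = {name: sum(stats[k] * w for k, w in weights.items())
              for name, stats in runners.items()}
    # Normalise so probabilities across the race sum to 1
    total = sum(scores.values())
    return {name: round(s / total, 3) for name, s in scores.items()}

print(race_probabilities({
    "Dobbin":    {"recent_form": 0.8, "course_wins": 0.2, "jockey_strike_rate": 0.15},
    "Lightning": {"recent_form": 0.6, "course_wins": 0.5, "jockey_strike_rate": 0.25},
}))
```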
It summarised the blueprint for the bot as …
| Component | Tech |
| --- | --- |
| AI Model | Mistral 7B via Ollama |
| Interface | Python script or web (Gradio/Streamlit) |
| Data Source | SQL Server 2008 R2 (via pyodbc/sqlalchemy) |
| Online Ratings | Web scraping (BeautifulSoup/Selenium) or API (if available) |
| Win Probability Logic | Custom Python logic (initially) + optional ML later |
| Output | Markdown or HTML tables for readability |
I was further pleased to be told the bot will be able to create a new database with tables to store the scraped data from online sources.
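As a rough sketch of that scrape-and-store step (the URL, the page structure and the table here are all hypothetical): BeautifulSoup pulls horse/rating pairs out of a page and pyodbc writes them into a table, creating it first if needed.

```python
# Sketch: scrape ratings from a web page and store them in SQL Server.
# The URL, the "ratings" table class and the database are hypothetical.
import requests
import pyodbc
from bs4 import BeautifulSoup

html = requests.get("https://example.com/ratings", timeout=30).text
soup = BeautifulSoup(html, "html.parser")

rows = []
for tr in soup.select("table.ratings tr")[1:]:  # skip the header row
    cells = tr.find_all("td")
    if len(cells) >= 2:
        rows.append((cells[0].get_text(strip=True),
                     float(cells[1].get_text(strip=True))))

conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                      "SERVER=localhost;DATABASE=RacingScraped;Trusted_Connection=yes;")
cur = conn.cursor()
cur.execute("IF OBJECT_ID('Ratings') IS NULL "
            "CREATE TABLE Ratings (Horse NVARCHAR(100), Rating FLOAT)")
cur.executemany("INSERT INTO Ratings (Horse, Rating) VALUES (?, ?)", rows)
conn.commit()
```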
We then installed Ollama & Mistral 7B on my PC. I did ask if I needed all the contents of the core model, as it takes up a fair slice of the drive, and whether I could remove some, but was told the bot needs everything it has learnt as it's not subject-specific, and removing any of it would be like removing part of its brain!
It then asked me to map the database, which I did by way of a screen grab showing the relationships between the tables, and in Excel I listed all fields in the relevant table, explaining their content and marking whether each was relevant or not. And that's where I am as I draft this post. Next will be to connect the bot to the database on SQL Server and see where to go from there.
I hope some find this project interesting, and if you feel I did something wrong or could have done better, please say so.