Diving into Generative AI: A Beginner's Roadmap to Getting Started

A friend of mine reached out to ask “how do I get started with all this new Generative AI stuff?”. They were feeling a bit late the the party and wanted a place to start. He’s not alone - with the rapid advancements in Generative AI, many people are eager to explore and experiment with this technology. If you're feeling a bit behind and wondering where to start, this guide will provide you with practical steps and resources to get started with Generative AI.

Getting Started with Chatbots

First things first let’s get used to talking to the various Generative AI models. A really cost-effective way to try out a huge number of models is to put $5 into an openrouter account: this one account gives you access to most of the top models in one place including OpenAI, Anthropic, Mistral, Meta and more. It even allows you to select multiple models and compare their responses side by side as you chat. You can also create an API key for use with your own code or any automation tool.

Exercise: I’d suggest starting with comparing the outputs of OpenAI GPT-4 Turbo and GPT-3.5 Turbo for the same prompt. Then try a similar comparison with Anthropic Claude 3.5 Sonnet and Haiku. This will give you a sense of the performance and cost of these models in practice. This cost vs capability trade-off is one of the most important to understand.

Experimenting with Code Generation

One the the most significant applications of generative AI has been generating code.

Exercise: My first suggested project would be to try using GPT-4 or Claude-3 Opus to build a very small app, game or tool. This is what got me hooked initially. I would spend hours asking GPT-4 to help me write little bits of code. This experience of going back and forth with the model taught me what is now called “prompt engineering” - techniques to get the best results of of the model. You see the mistakes the model makes and learn how the way you prompt the model gives varying levels of quality in the responses. Even if you are not a coder by background I’d still give this step a try, as one of my biggest insights using GenAI was that it allows me to do things I’d never done before and this remains one of my most important lessons: GenAI unlocks my ability to experiment: when I get stuck, I am one prompt away from getting help.

AI assisted coding

Today I think the most advanced integrated development environment for AI assisted coding is “Cursor IDE” at https://cursor.sh/. It’s a fork of VS Code (one of the most popular IDEs) designed for AI code gen (see the features here) and to work with the most powerful models (GPT-4) and it adds the ability to index and search documentation websites to help better prompt the model. Better yet you can use it for free if you bring your own API key! Just add your API key in the setting area and update the API base URL if you are using openrouter.

If you are new to setting up a development environment you might need to fight some dragons, but don’t forget you can ask for help from a powerful AI model - it should be able to help you solve any issues that come up.

The other easier but slightly less powerful option is using a cloud based IDE to get started, nothing to install and great for sharing and getting help. For this try replit. The big downside here is their free AI model is not great so you can do lots of copy and paste from a powerful model or try their paid plan for access to their “advanced” AI model, but this is still probably not as good as GPT-4-Turbo or Claude3-Opus.

Exploring Vision Models

One of my early “holy shit” moments with GenAI was watching OpenAI’s Greg Brockman demo “napkin sketch to code” using GPT-4. This whole livestream is good but don’t miss the vision to code demo at 16:11.

A good way to try out a vision model is to upload a screenshot or image and asking questions about it. Try comparing the quality of vision in Claude, Gemini and GPT-4.

Again I suggest experimenting with various prompts to see if you can come up with something that does something interesting with an image. Here are some ideas: I once built a apple shortcut app that used the OpenAI API to describe a picture of a household objects so I could sell them on facebook marketplace more quickly, and I’ve also prototyped a grading assistant for a teacher friend to help give feedback on student homework. Come up with a fun or useful use of vision and implement it.

Keeping up with the firehose

A week in AI seems like months in normal time. To keep up to date I’d suggest subscribing to this excellent AI generated AI newsletter. It’s sources are all linked and as one example you can click through to follow their list of “high signal” twitter accounts. You can also join the relevant discord servers for direct access to the community.

More to come

I’m going to keep adding to this article as I help my friend get started in GenAI, let me know what you would add.

Don’t forget to join my discord server and introduce yourself!