Google Cloud Intensive
DAY 04 · M2 · GENERATIVE AI

Now, AI that creates. ✨

Module 1 gave you the foundation. Now for the exciting part: how AI creates. We cover foundation models, prompts, tuning, and the agents that act for you.

Module 2: 7 Chapters 🗺️

This is the official Generative AI course, distilled. From the models underneath, all the way to building AI agents.

✨

1 · What is Gen AI

Creates & acts · the stack

💎

2 · Foundation Models

Gemini family & friends

🛠️

3 · Idea to App

Vertex AI Studio · prompts

🎛️

4 · Prompt Engineering

Temperature, Top-K, Top-P

🔗

5 · Grounding & Tuning

RAG & fine-tuning

🤖

6 · AI Agents

Chatbot → agent → agentic

🏗️

7 · Build Agents on Google Cloud

Gemini Enterprise · Agent Builder · ADK

Seven chapters, but they connect. Once you get foundation models, the rest just stacks on top. Let's go.
Chapter 1 · What is Gen AI

Two Things, Not One ✨

We met this in Module 1: generative AI creates and acts. Let's open that up.

🎨 It creates content

multimodal output
  • Text, code, images, speech, video, even 3D
  • From a prompt: a question or instruction
  • Summaries, reports, Q&A chatbots, images & video

🦾 It takes action

through AI agents
  • Autonomous, goal-oriented action on your behalf
  • Automate workflows, book travel, schedule
  • We reach agents at the end of this module
Definition to keep: Gen AI is a type of AI that generates content and takes action for you.

The Gen AI Stack: 3 Layers 🏛️

Just like the AI architecture in Module 1, but now for generative AI. Bottom to top:

3 · Gen AI Applications: Gemini Enterprise · NotebookLM · no-code
2 · Gen AI Development: Vertex AI Studio · Agent Builder · Model Garden
1 · Foundation Models: the intelligence (language, image, video)
Exam fact: the Transformer (Google, 2017) is the architecture behind modern Gen AI apps; Gemini (2023) is the Google model family built to be multimodal.
Foundation models are the brain at the bottom. Development tools in the middle. Ready-made apps on top. The whole module follows this stack, bottom up.
Chapter 2 · Foundation Models

What Is a Foundation Model? 💎

The backbone of every Gen AI app. Here's the idea in plain terms:

📚

Trained on a LOT

Learns from massive existing text, images, video. That learning = training.

🔢

Huge parameters

From millions → trillions. More parameters = more capacity to learn.

🧩

General-purpose

Pre-trained for broad use, then specialised later.

One line: A foundation model is a large model (lots of parameters, lots of training data, lots of compute) that becomes the intelligence behind Gen AI apps.

Google's Models: Tap to Explore 💎

🕹️ Tap each card to open it. Goal: know which model to reach for, Gemini Pro/Flash or a specialist. (Explore only, no points.)

💎 Gemini Pro

GENERAL
The most capable Gemini. For complex tasks needing advanced reasoning.

⚡ Gemini Flash

GENERAL
Optimised for speed & low latency: high-volume, real-time apps like chatbots.

🪶 Gemini Flash-Lite

GENERAL
The most cost-effective: high-volume tasks where time isn't critical (batch translation, summarising).

🖼️ Imagen / 🎬 Veo

SPECIALTY
Imagen = image generation. Veo = video. Also Chirp (voice), Lyria (music).

🔎 Embeddings

SPECIALTY
Turns content into vectors for semantic search & data representation. Powers RAG (Chapter 5).

💡 The big idea

Because Gemini is multimodal, it can often replace several specialists, handling text, image & video in one model.
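
To make the Embeddings card concrete, here is a minimal sketch of semantic search over vectors. Everything in it is an assumption for illustration: the three-number vectors and the document titles are hand-written, no Google API is called, and a real embeddings model would return much longer vectors.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written "embeddings" (assumption: real ones come from a model).
documents = {
    "Home insurance claim guide": [0.9, 0.1, 0.0],
    "Car insurance pricing": [0.1, 0.9, 0.1],
    "Office lunch menu": [0.0, 0.1, 0.9],
}


def semantic_search(query_vector, docs):
    """Rank document titles by similarity to the query vector, best match first."""
    return sorted(
        docs,
        key=lambda title: cosine_similarity(query_vector, docs[title]),
        reverse=True,
    )


# A made-up query vector that points roughly the same way as the home-insurance document.
query = [0.8, 0.2, 0.0]
ranking = semantic_search(query, documents)
print(ranking[0])  # Home insurance claim guide
```

The point is only that "meaning" becomes geometry: content with similar meaning gets vectors that point in similar directions, so search becomes a nearest-vector lookup.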

Multimodal · Pre-trained vs Fine-tuned 🧠

Multimodal = one model that takes in and creates across text, images, audio & video. For example, show Gemini a cookie photo and it generates a recipe video.

🎓 Pre-trained

broad foundation
  • Trained on a huge general dataset
  • Like 12 years of school: literate, general
  • Horizontal AI: works across industries

🩺 Fine-tuned

specialised
  • Further trained on a small, field-specific dataset
  • Like medical school: a specialist
  • Vertical AI: retail, finance, healthcare
Same analogy as a doctor: general schooling first (pre-trained), then specialist training (fine-tuned). We go deeper on tuning in Chapter 5.
Try It · Chapter 2

Match the Model to the Job 🧩

🕹️ Drag each task onto the model you'd use. Goal: match the job to Pro, Flash, or a specialist. Correct → +5⭐.
Tasks:
  • Complex task needing deep reasoning
  • Real-time chatbot, high volume
  • Generate a product image
  • Create a short video
  • Power a semantic search
Models:
  • 💎 Gemini Pro
  • ⚡ Gemini Flash
  • 🖼️ Imagen
  • 🎬 Veo
  • 🔎 Embeddings
Try it yourself first, ya. Drag each card to where you think it belongs.
Chapter 3 · Idea to App

Meet Bea, Ann & Ian 👩‍💼👨‍💻

Three people at Cymbal Insurance β€” our guides for the rest of the module.

👩‍💼

Bea · Analyst

No tech background. Wants to prototype an idea fast.

👩‍💻

Ann · AI Developer

Wants to design & manage prompts.

👨‍🔧

Ian · ML Engineer

Wants to deploy & fine-tune at scale.

Their gateway: Vertex AI Studio. One workshop to test prompts, tune models with your own data, ground with real info, and deploy, in low-code or even no-code.
Try It · Chapter 3

Anatomy of a Prompt: Tap Each Part 🧪

🕹️ Tap each highlighted line to see which prompt part it is. Goal: see what a strong prompt is built from: Task, Context, Examples. Each new part → +2⭐.
You are a business analyst at Cymbal Insurance. Conduct a housing risk assessment for southern Los Angeles. Rate each risk 1–5, classify by type, and follow this report template…
Tap a highlighted line above to see which prompt component it is.
Task is a must. Context and examples are optional but powerful. No examples = zero-shot; with examples = few-shot.
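
The Task / Context / Examples split can be sketched as a small helper. This is only an illustration: `build_prompt` and `shot_type` are hypothetical names, the labels it prints are one possible layout, and the example pairs are made up.

```python
def build_prompt(task, context=None, examples=None):
    """Assemble a prompt from a required task plus optional context and examples."""
    if not task:
        raise ValueError("A prompt needs a task.")
    parts = []
    if context:
        parts.append(f"Context: {context}")
    parts.append(f"Task: {task}")
    # Each example is an (input, desired output) pair shown to the model.
    for example_input, example_output in examples or []:
        parts.append(f"Example input: {example_input}\nExample output: {example_output}")
    return "\n\n".join(parts)


def shot_type(examples=None):
    """Name the prompting style: no examples is zero-shot, with examples is few-shot."""
    return "few-shot" if examples else "zero-shot"


# Made-up example pairs for a sentiment task.
sentiment_examples = [
    ("The claim was settled in two days.", "positive"),
    ("Nobody answered my calls.", "negative"),
]

zero_shot_prompt = build_prompt("Classify the sentiment of this customer review.")
few_shot_prompt = build_prompt(
    "Classify the sentiment of this customer review.",
    context="You are a business analyst at Cymbal Insurance.",
    examples=sentiment_examples,
)
print(shot_type(), "/", shot_type(sentiment_examples))  # zero-shot / few-shot
print(few_shot_prompt)
```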

Design vs Engineering · Idea → App 🛠️

✍️ Prompt Design

writing one good prompt
  • Crafting a prompt to get the response you want
  • Be direct & specific · use structure · iterate

🔁 Prompt Engineering

the whole iterative loop
  • Designing, refining & optimising prompts over time
  • Explore few-shot, chain-of-thought, RAG
The magic moment: In Vertex AI Studio, Bea & Ann click Build with Code → Deploy as App, and a working web app is generated. Idea → app, just like that.
Chapter 4 · Prompt Engineering

The Knobs You Can Turn 🎛️

After the prompt, you tune how the model picks its words. Four settings:

🤖

Model

Pick the right one: Gemini Flash/Pro, or specialists. Vertex AI Studio even hosts Claude, Llama, GPT.

🌡️

Temperature

Controls randomness. Low = safe & typical. High = creative & unusual.

#️⃣

Top-K

Pick randomly from the K most-likely words. K=2 → choose from the top 2.

🎯

Top-P

Pick from the smallest set of most-likely words whose probabilities add up to at least P.

You usually don't fuss with Top-K and Top-P. Temperature is the one you'll actually reach for most.
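
If you're curious what Top-K and Top-P do under the hood, here is a rough sketch. The word probabilities are invented, and real models work on tokens with far more candidates, so treat this as a toy and not as how Gemini is implemented.

```python
import random


def top_k_filter(probs, k):
    """Keep only the k most likely words."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


def top_p_filter(probs, p):
    """Keep the smallest set of most-likely words whose probabilities reach p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, total = {}, 0.0
    for word, prob in ranked:
        kept[word] = prob
        total += prob
        if total >= p:
            break
    return kept


def sample(probs, rng):
    """Pick one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


# Invented next-word probabilities for "The garden was full of beautiful ...".
next_word_probs = {"flowers": 0.5, "trees": 0.25, "herbs": 0.125, "bugs": 0.125}

print(top_k_filter(next_word_probs, 2))     # {'flowers': 0.5, 'trees': 0.25}
print(top_p_filter(next_word_probs, 0.75))  # {'flowers': 0.5, 'trees': 0.25}
print(sample(top_k_filter(next_word_probs, 2), random.Random()))
```

Both filters shrink the candidate list before the random pick: Top-K by count, Top-P by cumulative probability.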
Try It · Chapter 4

Feel the Temperature 🌡️

🕹️ Drag the temperature slider: the next word shifts from safe “flowers” (low) to wild “bugs” (high). Goal: feel how temperature trades predictability for creativity. Touch both ends → +5⭐.
The garden was full of beautiful flowers.
Slider range: ❄️ 0.0 to 2.0 🔥
Low temperature → typical, safe answers (great for Q&A). High → creative, unexpected (great for brainstorming).
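
One common way to picture temperature is as a divisor applied to the model's raw scores before they become probabilities. The sketch below assumes that softmax-style view; the scores are invented, and treating temperature 0 as "always pick the top word" is a convention assumed here, not something the course states.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        # Assumption: temperature 0 means always choosing the highest-scoring word.
        best = max(scores, key=scores.get)
        return {word: 1.0 if word == best else 0.0 for word in scores}
    scaled = {word: score / temperature for word, score in scores.items()}
    biggest = max(scaled.values())  # subtracting the max keeps exp() numerically stable
    exps = {word: math.exp(value - biggest) for word, value in scaled.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


# Invented raw scores for the next word after "The garden was full of beautiful ...".
scores = {"flowers": 3.0, "trees": 2.0, "bugs": 0.5}

for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(scores, temperature)
    print(temperature, {word: round(p, 3) for word, p in probs.items()})
```

At low temperature nearly all the probability sits on "flowers"; at high temperature "bugs" gets a real chance of being picked.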
Evaluate & refine in Vertex AI Studio: compare prompts side by side, add your own ground truth answer, and save reusable prompt templates with variables, like a function in plain language.
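
The "function in plain language" idea can be sketched with Python's standard `string.Template`. This is not how Vertex AI Studio stores templates; the variable names (`company`, `assessment`, `region`) are made up for illustration.

```python
from string import Template

# A reusable prompt with three replaceable variables (names are hypothetical).
risk_report_template = Template(
    "You are a business analyst at $company. "
    "Conduct a $assessment for $region. "
    "Rate each risk 1-5 and classify it by type."
)

# Filling in the variables works like calling a function with arguments.
prompt = risk_report_template.substitute(
    company="Cymbal Insurance",
    assessment="housing risk assessment",
    region="southern Los Angeles",
)
print(prompt)
```

Swap the values and the same template produces a new prompt; leaving a variable out makes `substitute` raise a `KeyError`, which catches half-filled prompts early.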
Chapter 5 · Grounding & Tuning

Keeping Answers Accurate 🔗

Foundation models are pre-trained, so their knowledge can be outdated. Two fixes:

🔗 Grounding

the WHAT
  • Connect the model to trusted, current data
  • Answers get verified against the latest info
  • Ground with Google Search or your own data

📚 RAG

the HOW
  • Retrieval-Augmented Generation
  • The method that implements grounding
  • Retrieves relevant data, then generates
Remember it as a pair: Grounding = the what. RAG = the how.
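
Here is a toy sketch of the "retrieve, then generate" flow. The retriever just counts shared words (real RAG systems typically use embeddings), the knowledge-base sentences are invented, and the final call to a model is left out on purpose.

```python
def retrieve(question, documents, top_n=1):
    """Rank documents by how many question words they share (a toy retriever)."""
    question_words = set(question.lower().split())

    def overlap(document):
        return len(question_words & set(document.lower().split()))

    return sorted(documents, key=overlap, reverse=True)[:top_n]


def build_grounded_prompt(question, documents):
    """Put the retrieved sources in front of the question before it goes to a model."""
    sources = retrieve(question, documents)
    source_text = "\n".join(f"- {source}" for source in sources)
    return (
        "Answer using only the sources below.\n"
        f"Sources:\n{source_text}\n"
        f"Question: {question}"
    )


# Invented knowledge base standing in for "your own data".
knowledge_base = [
    "Flood cover was added to the home policy in the latest update.",
    "The office is closed on public holidays.",
]

question = "Does the home policy include flood cover?"
grounded_prompt = build_grounded_prompt(question, knowledge_base)
print(grounded_prompt)
# The generate step would send grounded_prompt to a foundation model.
```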

The Tuning Spectrum 🎚️

Want to improve the model itself? Options run from light to heavy:

LIGHTEST

Prompt Design

Guide with words. Doesn't change the model.

→
MIDDLE

Parameter-Efficient (Adapter) Tuning

Updates a small subset of parameters.

→
HEAVIEST

Full Fine-Tuning

Updates ALL parameters. Best quality, most compute.

Grounding vs fine-tuning: fine-tuning refines the model's internal knowledge & skill; grounding adds external, real-time facts. Different jobs.
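
To get a feel for "small subset" versus "ALL parameters", here is some back-of-envelope arithmetic. It assumes a low-rank adapter, which is one well-known parameter-efficient technique; the course does not say which method Vertex AI uses, and the layer size and rank are made-up numbers.

```python
def full_tuning_params(d_in, d_out):
    """Full fine-tuning: every weight in a d_in x d_out layer is trainable."""
    return d_in * d_out


def low_rank_adapter_params(d_in, d_out, rank):
    """Adapter tuning (assumed low-rank): train two thin matrices, d_in x rank and rank x d_out."""
    return d_in * rank + rank * d_out


# Hypothetical layer size and adapter rank, chosen only for illustration.
d_in = d_out = 4096
rank = 8

full = full_tuning_params(d_in, d_out)
adapter = low_rank_adapter_params(d_in, d_out, rank)

print(f"Full fine-tuning: {full:,} trainable weights")   # 16,777,216
print(f"Adapter tuning:   {adapter:,} trainable weights")  # 65,536
print(f"Adapter share:    {adapter / full:.2%}")          # 0.39%
```

Under these assumptions the adapter trains well under 1% of the layer's weights, which is why it sits in the middle of the spectrum for compute.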

Supervised Fine-Tuning 🏷️

The tuning Vertex AI supports today. You teach the model a new skill with labelled examples.

🏷️

Labelled pairs

Hundreds of input → desired output examples, in a JSONL file.

🎯

Good for

Classification, summarising, extraction, chat: well-defined tasks.

📦

Result

A new tuned model in the Model Registry, ready to deploy.

{"input": "The room was terrible. It needs major rework.", "output": "negative"}
{"input": "Great interior layout, architecturally interesting.", "output": "positive"}
Each row = a prompt and the answer you want. The model learns to copy that behaviour. Same idea as supervised learning in Module 1, just on a foundation model.
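
JSONL simply means one JSON object per line. A minimal sketch for writing and checking such a file's contents follows; it uses the `input` / `output` field names from the example above, which may differ from the exact schema a given tuning service expects.

```python
import json


def to_jsonl(examples):
    """Serialise each labelled example as one JSON object per line."""
    return "\n".join(json.dumps(example) for example in examples)


def parse_jsonl(text):
    """Read JSONL text back, checking every row has both an input and an output."""
    rows = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        row = json.loads(line)
        missing = {"input", "output"} - row.keys()
        if missing:
            raise ValueError(f"Line {line_number} is missing: {sorted(missing)}")
        rows.append(row)
    return rows


training_examples = [
    {"input": "The room was terrible. It needs major rework.", "output": "negative"},
    {"input": "Great interior layout, architecturally interesting.", "output": "positive"},
]

jsonl_text = to_jsonl(training_examples)
print(jsonl_text)
print(len(parse_jsonl(jsonl_text)), "valid rows")  # 2 valid rows
```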
Chapter 6 · AI Agents

From Chatbot to Agentic AI 🤖

Gen AI is evolving. A chatbot answers. An agent acts. Agentic AI coordinates many agents.

ASK

Chatbot

You prompt, it answers. Conversational.

→
ACT

AI Agent

Connects to tools & data, takes action, observes feedback.

→
COORDINATE

Agentic AI

Multiple agents reasoning together on multi-step tasks.

Why agents matter: Foundation models alone can't reach your internal docs or other apps. An agent connects to them and takes action; that's the value it adds.

What's Inside an Agent? 🧩

Three components working together β€” picture a body:

🧠

Model: the brain

The reasoning centre. Thinks, plans, decides the steps to reach the goal.

🦾

Tools: hands & senses

APIs (GET, POST…) to act on the world: send an email, fetch the weather.

🔄

Orchestration: nervous system

The cyclical loop: take the decision → use a tool → feed the result back.

An AI agent = Model + Tools + Orchestration, all coordinated toward a goal.
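
The three parts can be sketched as a toy loop. Nothing here is a real agent framework: `toy_model` is a hard-coded stand-in for a foundation model, `get_weather` is a fake tool that calls no API, and all names are hypothetical.

```python
def get_weather(city):
    """Hypothetical tool: a real one would call a weather API."""
    return f"Sunny in {city}"


# Tools: the hands & senses the agent is allowed to use.
TOOLS = {"get_weather": get_weather}


def toy_model(goal, observations):
    """Stand-in for the model (the brain): decide the next step from what is known so far."""
    if not observations:
        return {"action": "get_weather", "argument": goal["city"]}
    return {"action": "finish", "answer": f"Forecast: {observations[-1]}"}


def run_agent(goal, max_steps=5):
    """Orchestration: ask the model, run the chosen tool, feed the result back."""
    observations = []
    for _ in range(max_steps):
        decision = toy_model(goal, observations)
        if decision["action"] == "finish":
            return decision["answer"]
        tool = TOOLS[decision["action"]]
        observations.append(tool(decision["argument"]))
    return "Stopped: step limit reached"


print(run_agent({"city": "Kuala Lumpur"}))  # Forecast: Sunny in Kuala Lumpur
```

The `max_steps` limit is a design choice worth noting: because the loop is cyclical, an orchestrator needs some stopping rule in case the model never decides it is done.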
Try It · Chapter 6

Match the Agent Component 🧩

🕹️ Drag each role onto its component: Model, Tools, or Orchestration. Goal: lock in brain vs hands vs nervous system. Correct → +5⭐.
Roles:
  • Reasoning & decision centre
  • Calls an API to send an email
  • Manages the cycle of actions
  • Fetches live weather data
  • Plans the steps to reach the goal
  • Carries feedback back to the brain
Components:
  • 🧠 Model
  • 🦾 Tools
  • 🔄 Orchestration
Try it first, ya: brain, hands, or nervous system?
Chapter 7 · Build Agents on Google Cloud

The Agent Tool Stack 🏗️

Same three layers as before, now for building agents. Bottom to top:

3 · Applications: Gemini Enterprise · Customer Engagement Suite · no-code
2 · Development: Vertex AI Agent Builder · ADK · Agent Engine · Agent Garden
1 · Foundation Models: Vertex AI Model Garden, the agent's brain
Model Garden = access to Google's & third-party models (the brain). Agent Builder = build agents end-to-end. Gemini Enterprise = no-code agents for business users.

Which Tool? Ease vs Flexibility 🧭

Pick by how much code you want to write versus how much control you need.

Tool | Code | Best for
Gemini Enterprise | No-code | Business users: ready-to-use, minimal setup
Agent Garden + Builder | Low-code | Analysts: start from a sample & customise
ADK (Agent Development Kit) | Pro-code | Engineers: full control, deep integrations
NotebookLM is a standout agent in this stack: your personal AI research assistant. Add sources (PDFs, Drive, YouTube), then chat, summarise, or generate a podcast & study guide. Free for everyone.
More ease → less flexibility. More flexibility → more code. Choose by who's building and how custom it must be.
Knowledge Check · Module 2

Lock It In 🧪

Q1. What makes a model a "foundation model"?

It only does images
Large: many parameters, huge data, lots of compute
It runs on your laptop
It needs no training

Q2. Which architecture (Google, 2017) underpins modern Gen AI?

BigQuery
TPU
K-means
The Transformer

Q3. The iterative loop of refining & optimising prompts is:

Prompt design
Prompt engineering
Deployment
Clustering

Q4. A reusable prompt with replaceable variables is a:

Ground truth
Prompt template
Foundation model
Endpoint

Q5. Which tuning updates ALL the model's parameters?

Parameter-efficient tuning
Full fine-tuning
Prompt design
Grounding

Q6. Which component is the agent's "brain"?

The model
The tools
The orchestration layer
The database

Q7. A business user wants a no-code agent. Best choice?

Gemini Enterprise
ADK
Full fine-tuning
Compute Engine

Q8. As you move from Gemini Enterprise → ADK, you gain:

More flexibility, but write more code
Less control
No-code simplicity
Fewer options

Module 2: You Can Now Explain… ✅

Creates + Acts | Transformer (2017) | Gen AI Stack | Foundation Model
Gemini Pro / Flash | Multimodal | Pre-trained vs Fine-tuned | Task · Context · Examples
Zero / Few-shot | Temperature / Top-K / Top-P | Grounding & RAG | Fine-tuning
Model + Tools + Orchestration | Chatbot → Agent → Agentic | Gemini Enterprise · ADK
Finish Module 2 in 2 steps: Do the hands-on lab first, then take the graded quiz.

🧪 Step 1 · Lab: Gemini Multimodal with Vertex AI Studio

Build an app from a prompt, apply prompt best practices, and generate multimodal media. Open the lab →

📝 Step 2 · Official Module 2 Quiz →

After the lab, take the graded quiz on Skills Boost. skills.google › course 593 › quiz 617895

From models to agents: you've got it. ✨
MODULE 2 COMPLETE

Next: AI Development Options. 🛤️

You've seen how Gen AI is built. Next we compare the four ways to build any AI (pre-trained APIs, BigQuery ML, AutoML, and custom training) and how to choose.

🛤️

M3 · AI Dev Options

Start now →

🔄

M4 · AI Dev Workflow

Coming soon

🏅

Then: the exam

You're getting there