Use Part.from_uri for images stored in Google Cloud Storage:
```python
from google import genai
from google.genai import types

client = genai.Client(api_key='your-api-key')

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=[
        'What is this image about?',
        types.Part.from_uri(
            file_uri='gs://generativeai-downloads/images/scones.jpg',
            mime_type='image/jpeg',
        ),
    ],
)
print(response.text)
```
Supported image formats: JPEG, PNG, WebP, GIF
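Because `mime_type` has to match the file being sent, it can help to derive it from the file extension. The following is a minimal sketch, not part of the SDK: the `IMAGE_MIME_TYPES` table and the `image_mime_type` helper are hypothetical names, and the mapping covers only the four formats listed above.

```python
from pathlib import Path

# Hypothetical lookup table covering only the formats listed above.
IMAGE_MIME_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".webp": "image/webp",
    ".gif": "image/gif",
}


def image_mime_type(path: str) -> str:
    """Return the MIME type for a supported image path, or raise ValueError."""
    suffix = Path(path).suffix.lower()  # case-insensitive extension match
    try:
        return IMAGE_MIME_TYPES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported image format: {suffix or path!r}") from None
```

The returned string can then be passed as the `mime_type` argument of `types.Part.from_uri` or `types.Part.from_bytes`.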
Use Part.from_bytes for local image files:
```python
from google.genai import types

with open('your_image_path.jpg', 'rb') as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=[
        'What is this image about?',
        types.Part.from_bytes(
            data=image_bytes,
            mime_type='image/jpeg'
        ),
    ],
)
print(response.text)
```
Best for images under 20MB. For larger files, use the File API.
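The size guideline above can be turned into a simple check before deciding how to send a file. This is a sketch under stated assumptions: `choose_upload_method` and `INLINE_LIMIT_BYTES` are hypothetical names, and "20MB" is interpreted here as 20 * 1024 * 1024 bytes, which the text above does not specify.

```python
import os

# Assumption: "20MB" is read as 20 * 1024 * 1024 bytes.
INLINE_LIMIT_BYTES = 20 * 1024 * 1024


def choose_upload_method(size_bytes: int, limit: int = INLINE_LIMIT_BYTES) -> str:
    """Return 'inline' for payloads under the limit, 'file_api' otherwise."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "inline" if size_bytes < limit else "file_api"


def choose_for_path(path: str) -> str:
    """Apply the same rule to a file on disk."""
    return choose_upload_method(os.path.getsize(path))
```

A result of `'inline'` suggests `Part.from_bytes`; `'file_api'` suggests uploading with `client.files.upload` first.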
Upload images to the File API first (Gemini Developer API only):
```python
# Upload the image
file = client.files.upload(file='image.jpg')

# Use it in generate_content
response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=['Describe this image', file]
)
print(response.text)
```
Best for large images or when reusing the same image multiple times.
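To illustrate the reuse case, an uploaded file can be passed to several requests without uploading it again. The helper below is a rough sketch: `ask_about_file` is a hypothetical name, and it assumes only the `client.models.generate_content(model=..., contents=[...])` call and the `response.text` attribute shown above.

```python
from typing import Any, List


def ask_about_file(
    client: Any,
    file: Any,
    prompts: List[str],
    model: str = "gemini-2.5-flash",
) -> List[str]:
    """Send each prompt together with the same uploaded file; return the answers."""
    answers = []
    for prompt in prompts:
        response = client.models.generate_content(
            model=model,
            contents=[prompt, file],  # the file object is reused, not re-uploaded
        )
        answers.append(response.text)
    return answers
```

For example, after `file = client.files.upload(file='image.jpg')`, calling `ask_about_file(client, file, ['Describe this image', 'List the objects in this image'])` issues two requests against a single upload.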
Process audio files for transcription, analysis, or understanding:
Audio can be supplied in three ways: as local bytes, from Cloud Storage, or through the File API (for long audio).
Use Part.from_bytes for local audio files:
```python
from google.genai import types

with open('audio_sample.mp3', 'rb') as f:
    audio_bytes = f.read()

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=[
        types.Part.from_bytes(
            data=audio_bytes,
            mime_type='audio/mp3',
        ),
        'Transcribe this audio.'
    ]
)
print(response.text)
```
Supported audio formats: MP3, WAV, FLAC, AAC
Use Part.from_uri for audio stored in Google Cloud Storage:
```python
from google.genai import types

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=[
        'What is being discussed in this audio?',
        types.Part.from_uri(
            file_uri='gs://your-bucket/audio.mp3',
            mime_type='audio/mp3',
        ),
    ],
)
print(response.text)
```
For long audio files, upload to the File API first:
```python
import time

# Upload
audio_file = client.files.upload(file='podcast.mp3')

# Wait for processing to complete
while audio_file.state == 'PROCESSING':
    time.sleep(2)
    audio_file = client.files.get(name=audio_file.name)

# Generate content
response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=[audio_file, 'Summarize this podcast.']
)
print(response.text)
```
Analyze video content for descriptions, summaries, or specific questions:
Three approaches are covered: Cloud Storage video, the File API (recommended), and video frames.
Use Part.from_uri for videos stored in Google Cloud Storage:
```python
from google.genai import types

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=[
        'What happens in this video?',
        types.Part.from_uri(
            file_uri='gs://your-bucket/video.mp4',
            mime_type='video/mp4',
        ),
    ],
)
print(response.text)
```
Supported video formats: MP4, MOV, AVI, WebM, FLV, MPG
The File API is recommended for videos:
```python
import time

# Upload
video_file = client.files.upload(file='video.mp4')

# Wait for processing
while video_file.state == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

# Generate content
response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=[video_file, 'What happens in this video?']
)
print(response.text)
```
Video files typically require processing time. Always check the file state before using it.
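The state check can be wrapped in a small polling helper that also gives up after a timeout and surfaces a failed upload. This is a sketch, not SDK code: `wait_until_processed` is a hypothetical name, the timeout values are arbitrary, and the `'FAILED'` state name is an assumption about the File API's state values. It takes a callable so that the polling logic does not depend on a particular client.

```python
import time
from typing import Any, Callable


def wait_until_processed(
    get_file: Callable[[], Any],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll get_file() until the file leaves the PROCESSING state."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        file = get_file()
        # Accept either an enum-like state (with .name) or a plain string.
        state = getattr(file.state, "name", file.state)
        if state != "PROCESSING":
            if state == "FAILED":  # assumed name of the failure state
                raise RuntimeError("File processing failed")
            return file
        if time.monotonic() >= deadline:
            raise TimeoutError("File is still processing after the timeout")
        sleep(poll_seconds)
```

With the upload above, this might be used as `video_file = wait_until_processed(lambda: client.files.get(name=video_file.name))`.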
Ask about a specific moment in the video by referencing its timestamp in the prompt:
```python
from google.genai import types

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=[
        'Describe what happens at the 30 second mark',
        types.Part.from_uri(
            file_uri='gs://your-bucket/video.mp4',
            mime_type='video/mp4',
        ),
    ],
)
print(response.text)
```
Upload a PDF to the File API, then ask questions about it:
```python
import time

# Upload PDF
pdf_file = client.files.upload(file='document.pdf')

# Wait for processing
while pdf_file.state == 'PROCESSING':
    time.sleep(2)
    pdf_file = client.files.get(name=pdf_file.name)

# Ask questions about the PDF
response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=['Summarize this document', pdf_file]
)
print(response.text)
```
Use Part.from_uri for PDFs stored in Google Cloud Storage:
```python
from google.genai import types

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=[
        'What are the key findings in this research paper?',
        types.Part.from_uri(
            file_uri='gs://your-bucket/research-paper.pdf',
            mime_type='application/pdf',
        ),
    ],
)
print(response.text)
```
Analyze multiple PDFs together:
```python
from google.genai import types

# Upload two PDFs
file1 = client.files.upload(file='paper1.pdf')
file2 = client.files.upload(file='paper2.pdf')

# Use them together
response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=[
        'Compare and contrast these two research papers.',
        file1,
        file2
    ]
)
print(response.text)
```
You can mix different media types in a single request:
```python
from google.genai import types

# Upload files
image_file = client.files.upload(file='chart.png')
audio_file = client.files.upload(file='presentation.mp3')

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=[
        'Based on this chart and audio presentation, ',
        image_file,
        audio_file,
        'what are the main conclusions?'
    ]
)
print(response.text)
```
Manage uploaded files with the File API:
```python
# Upload
file = client.files.upload(file='document.pdf')
print(f"Uploaded: {file.name}")

# Get file info
file_info = client.files.get(name=file.name)
print(f"State: {file_info.state}")
print(f"Size: {file_info.size_bytes} bytes")

# List all files
for f in client.files.list():
    print(f"{f.name}: {f.state}")

# Delete when done
client.files.delete(name=file.name)
```
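To make sure the delete step runs even if a request fails in between, the upload and delete calls can be paired in a context manager. This is one possible sketch: `uploaded_file` is a hypothetical helper, and it assumes only the `client.files.upload(file=...)` and `client.files.delete(name=...)` calls shown above.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def uploaded_file(client: Any, path: str) -> Iterator[Any]:
    """Upload a file, yield it, and delete it when the block exits."""
    file = client.files.upload(file=path)
    try:
        yield file
    finally:
        # Runs on normal exit and on exceptions raised inside the block.
        client.files.delete(name=file.name)
```

A typical use would be `with uploaded_file(client, 'document.pdf') as pdf_file:` followed by one or more `generate_content` calls inside the block.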