Using the Gemini API

[2024-11-09] Gemini is now accessible from the OpenAI Library: the same approach used for the OpenAI API now works as-is.
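
A minimal sketch of what that looks like, using the compatibility endpoint given in the announcement (the model name and prompt here are just placeholders):

from openai import OpenAI

# Point the OpenAI client at Gemini's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="GEMINI_API_KEY",  # a Gemini API key, not an OpenAI one
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

response = client.chat.completions.create(
    model="gemini-1.5-flash",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)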

Following Gemini API: Quickstart with Python, install the Python package with

pip install google-generativeai

Once installed, import it with

import google.generativeai as genai

and you are ready to use it.

In Google AI Studio, choose "Create API key in new project" under API keys to have an API key issued. Set the key as an environment variable, e.g.

export GOOGLE_API_KEY="..."

and it will be picked up automatically; if that does not work,

genai.configure(api_key="...")

sets it explicitly.
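
A minimal sketch of the explicit route, reading the key from the environment (GOOGLE_API_KEY is the variable name used above):

import os
import google.generativeai as genai

# Pass the key explicitly instead of relying on automatic pickup.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])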

Typing

list(genai.list_models())

lists the available models. Example:

[Model(name='models/chat-bison-001',
       base_model_id='',
       version='001',
       display_name='PaLM 2 Chat (Legacy)',
       description='A legacy text-only model optimized for chat conversations',
       input_token_limit=4096,
       output_token_limit=1024,
       supported_generation_methods=['generateMessage', 'countMessageTokens'],
       temperature=0.25,
       max_temperature=None,
       top_p=0.95,
       top_k=40),
 Model(name='models/text-bison-001',
       base_model_id='',
       version='001',
       display_name='PaLM 2 (Legacy)',
       description='A legacy model that understands text and generates text as an output',
       input_token_limit=8196,
       output_token_limit=1024,
       supported_generation_methods=['generateText', 'countTextTokens', 'createTunedTextModel'],
       temperature=0.7,
       max_temperature=None,
       top_p=0.95,
       top_k=40),
 Model(name='models/embedding-gecko-001',
       base_model_id='',
       version='001',
       display_name='Embedding Gecko',
       description='Obtain a distributed representation of a text.',
       input_token_limit=1024,
       output_token_limit=1,
       supported_generation_methods=['embedText', 'countTextTokens'],
       temperature=None,
       max_temperature=None,
       top_p=None,
       top_k=None),
 Model(name='models/gemini-1.0-pro-latest',
       base_model_id='',
       version='001',
       display_name='Gemini 1.0 Pro Latest',
       description=('The best model for scaling across a wide range of tasks. This is the latest '
                    'model.'),
       input_token_limit=30720,
       output_token_limit=2048,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=0.9,
       max_temperature=None,
       top_p=1.0,
       top_k=None),
 Model(name='models/gemini-1.0-pro',
       base_model_id='',
       version='001',
       display_name='Gemini 1.0 Pro',
       description='The best model for scaling across a wide range of tasks',
       input_token_limit=30720,
       output_token_limit=2048,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=0.9,
       max_temperature=None,
       top_p=1.0,
       top_k=None),
 Model(name='models/gemini-pro',
       base_model_id='',
       version='001',
       display_name='Gemini 1.0 Pro',
       description='The best model for scaling across a wide range of tasks',
       input_token_limit=30720,
       output_token_limit=2048,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=0.9,
       max_temperature=None,
       top_p=1.0,
       top_k=None),
 Model(name='models/gemini-1.0-pro-001',
       base_model_id='',
       version='001',
       display_name='Gemini 1.0 Pro 001 (Tuning)',
       description=('The best model for scaling across a wide range of tasks. This is a stable '
                    'model that supports tuning.'),
       input_token_limit=30720,
       output_token_limit=2048,
       supported_generation_methods=['generateContent', 'countTokens', 'createTunedModel'],
       temperature=0.9,
       max_temperature=None,
       top_p=1.0,
       top_k=None),
 Model(name='models/gemini-1.0-pro-vision-latest',
       base_model_id='',
       version='001',
       display_name='Gemini 1.0 Pro Vision',
       description='The best image understanding model to handle a broad range of applications',
       input_token_limit=12288,
       output_token_limit=4096,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=0.4,
       max_temperature=None,
       top_p=1.0,
       top_k=32),
 Model(name='models/gemini-pro-vision',
       base_model_id='',
       version='001',
       display_name='Gemini 1.0 Pro Vision',
       description='The best image understanding model to handle a broad range of applications',
       input_token_limit=12288,
       output_token_limit=4096,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=0.4,
       max_temperature=None,
       top_p=1.0,
       top_k=32),
 Model(name='models/gemini-1.5-pro-latest',
       base_model_id='',
       version='001',
       display_name='Gemini 1.5 Pro Latest',
       description='Mid-size multimodal model that supports up to 2 million tokens',
       input_token_limit=2000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=64),
 Model(name='models/gemini-1.5-pro-001',
       base_model_id='',
       version='001',
       display_name='Gemini 1.5 Pro 001',
       description='Mid-size multimodal model that supports up to 2 million tokens',
       input_token_limit=2000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens', 'createCachedContent'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=64),
 Model(name='models/gemini-1.5-pro-002',
       base_model_id='',
       version='002',
       display_name='Gemini 1.5 Pro 002',
       description='Mid-size multimodal model that supports up to 2 million tokens',
       input_token_limit=2000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=40),
 Model(name='models/gemini-1.5-pro',
       base_model_id='',
       version='001',
       display_name='Gemini 1.5 Pro',
       description='Mid-size multimodal model that supports up to 2 million tokens',
       input_token_limit=2000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=64),
 Model(name='models/gemini-1.5-pro-exp-0801',
       base_model_id='',
       version='exp-0801',
       display_name='Gemini 1.5 Pro Experimental 0801',
       description='Mid-size multimodal model that supports up to 2 million tokens',
       input_token_limit=2000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=64),
 Model(name='models/gemini-1.5-pro-exp-0827',
       base_model_id='',
       version='exp-0827',
       display_name='Gemini 1.5 Pro Experimental 0827',
       description='Mid-size multimodal model that supports up to 2 million tokens',
       input_token_limit=2000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=64),
 Model(name='models/gemini-1.5-flash-latest',
       base_model_id='',
       version='001',
       display_name='Gemini 1.5 Flash Latest',
       description='Fast and versatile multimodal model for scaling across diverse tasks',
       input_token_limit=1000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=64),
 Model(name='models/gemini-1.5-flash-001',
       base_model_id='',
       version='001',
       display_name='Gemini 1.5 Flash 001',
       description='Fast and versatile multimodal model for scaling across diverse tasks',
       input_token_limit=1000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens', 'createCachedContent'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=64),
 Model(name='models/gemini-1.5-flash-001-tuning',
       base_model_id='',
       version='001',
       display_name='Gemini 1.5 Flash 001 Tuning',
       description='Fast and versatile multimodal model for scaling across diverse tasks',
       input_token_limit=16384,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens', 'createTunedModel'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=64),
 Model(name='models/gemini-1.5-flash',
       base_model_id='',
       version='001',
       display_name='Gemini 1.5 Flash',
       description='Fast and versatile multimodal model for scaling across diverse tasks',
       input_token_limit=1000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=64),
 Model(name='models/gemini-1.5-flash-exp-0827',
       base_model_id='',
       version='exp-0827',
       display_name='Gemini 1.5 Flash Experimental 0827',
       description='Fast and versatile multimodal model for scaling across diverse tasks',
       input_token_limit=1000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=64),
 Model(name='models/gemini-1.5-flash-8b-exp-0827',
       base_model_id='',
       version='001',
       display_name='Gemini 1.5 Flash 8B Experimental 0827',
       description='Fast and versatile multimodal model for scaling across diverse tasks',
       input_token_limit=1000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=40),
 Model(name='models/gemini-1.5-flash-8b-exp-0924',
       base_model_id='',
       version='001',
       display_name='Gemini 1.5 Flash 8B Experimental 0924',
       description='Fast and versatile multimodal model for scaling across diverse tasks',
       input_token_limit=1000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=40),
 Model(name='models/gemini-1.5-flash-002',
       base_model_id='',
       version='002',
       display_name='Gemini 1.5 Flash 002',
       description='Fast and versatile multimodal model for scaling across diverse tasks',
       input_token_limit=1000000,
       output_token_limit=8192,
       supported_generation_methods=['generateContent', 'countTokens'],
       temperature=1.0,
       max_temperature=2.0,
       top_p=0.95,
       top_k=40),
 Model(name='models/embedding-001',
       base_model_id='',
       version='001',
       display_name='Embedding 001',
       description='Obtain a distributed representation of a text.',
       input_token_limit=2048,
       output_token_limit=1,
       supported_generation_methods=['embedContent'],
       temperature=None,
       max_temperature=None,
       top_p=None,
       top_k=None),
 Model(name='models/text-embedding-004',
       base_model_id='',
       version='004',
       display_name='Text Embedding 004',
       description='Obtain a distributed representation of a text.',
       input_token_limit=2048,
       output_token_limit=1,
       supported_generation_methods=['embedContent'],
       temperature=None,
       max_temperature=None,
       top_p=None,
       top_k=None),
 Model(name='models/aqa',
       base_model_id='',
       version='001',
       display_name='Model that performs Attributed Question Answering.',
       description=('Model trained to return answers to questions that are grounded in provided '
                    'sources, along with estimating answerable probability.'),
       input_token_limit=7168,
       output_token_limit=1024,
       supported_generation_methods=['generateAnswer'],
       temperature=0.2,
       max_temperature=None,
       top_p=1.0,
       top_k=40)]

The models/ prefix can apparently be omitted.
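
To see only the models usable for text generation, you can filter on supported_generation_methods (a sketch following the quickstart's idiom):

for m in genai.list_models():
    # Keep only models that support the generateContent method.
    if "generateContent" in m.supported_generation_methods:
        print(m.name)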

Let's try asking something with 'gemini-1.5-pro-latest':

model = genai.GenerativeModel('gemini-1.5-pro-latest')

response = model.generate_content("What is the meaning of life?")
print(response.text)

If a prompt trips the safety filters, printing response.prompt_feedback shows the details.
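
A minimal sketch of that check; accessing response.text raises a ValueError when the response was blocked:

response = model.generate_content("What is the meaning of life?")
try:
    print(response.text)
except ValueError:
    # No text came back; show the block reason and safety ratings.
    print(response.prompt_feedback)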

The method above allows only a single question. For a multi-turn conversation, do the following:

model = genai.GenerativeModel('gemini-1.5-pro-latest')

chat = model.start_chat(history=[])
response = chat.send_message("What is your knowledge cutoff?")
print(response.text)
response = chat.send_message("What are reputable news sources?")
print(response.text)
...

The whole conversation can be inspected via chat.history:

for h in chat.history:
    print(f"[{h.role}]\n{h.parts[0].text}")

You can also specify the temperature (degree of randomness) and other parameters:

config = genai.GenerationConfig(temperature=0)
model = genai.GenerativeModel('gemini-1.5-pro-latest',
                              generation_config=config)
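
generate_content also accepts a generation_config per call, so one model object can be reused with different settings (a sketch; max_output_tokens is another of the available fields):

response = model.generate_content(
    "Summarize relativity in one sentence.",
    generation_config=genai.GenerationConfig(temperature=0, max_output_tokens=128),
)
print(response.text)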