Comments on What's the maximum hit rate, if any, when using Claude, Gemini, Llama and Mistral via Google Cloud Compute?

Post

What's the maximum hit rate, if any, when using Claude, Gemini, Llama and Mistral via Google Cloud Compute?

+0
−5

What's the maximum hit rate, if any, when using Claude, Gemini, Llama and Mistral via Google Cloud Compute? (Example of a maximum hit rate: 1M input tokens per minute.)

I don't use provisioned throughput.


I call Gemini as follows:

from google import genai

YOUR_PROJECT_ID = 'redacted'
YOUR_LOCATION = 'us-central1'

# Create a client that sends requests through Vertex AI.
client = genai.Client(
    vertexai=True, project=YOUR_PROJECT_ID, location=YOUR_LOCATION,
)
model = "gemini-2.5-pro-exp-03-25"
response = client.models.generate_content(
    model=model,
    contents=[
        "Tell me a joke about alligators"
    ],
)
print(response.text, end="")

3 comment threads

Not a good fit for a Q&A site (2 comments)
If you cross-post the same question on multiple sites, you should include links to all other versions... (10 comments)
What have you tried? (2 comments)
Not a good fit for a Q&A site
Karl Knechtel wrote 3 days ago · edited 3 days ago

While it's perfectly on topic to have questions about how to use LLM and/or cloud computing services generally - insofar as there really is a generic way to do it - I don't think we should be trying to provide customer support for individual providers. Any rate limits on a Google service are Google's responsibility; the only ethical way for us to come up with an answer would be to ask Google directly or find Google documentation. In other words, it's not something where people can contribute their own expertise.

In my view, that's against the spirit of how Q&A sites are intended to work - whether that's Codidact, the Stack Exchange network or anything else - and simply out of scope.

However - if, for example, there's some technical reason why evaluating your process (to see whether you were in danger of hitting rate limits) were non-trivial, perhaps you could ask a question about how to make that evaluation.
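
For illustration only, here is a minimal sketch of what such an evaluation might look like on the client side: a sliding-window counter that tracks how many tokens were sent in the last minute and checks the next request against a limit. The names `TokenRateTracker` and `would_exceed` are hypothetical, and the limit of 1000 tokens per minute is a placeholder, not a real quota; the actual figure would have to come from Google's documentation.

```python
from collections import deque


class TokenRateTracker:
    """Hypothetical helper: counts tokens sent within a sliding time window."""

    def __init__(self, window_seconds=60.0):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, token_count) pairs, oldest first
        self._total = 0

    def record(self, token_count, now):
        """Register a request of `token_count` tokens sent at time `now` (seconds)."""
        self._events.append((now, token_count))
        self._total += token_count
        self._evict(now)

    def tokens_in_window(self, now):
        """Return the number of tokens recorded during the last window."""
        self._evict(now)
        return self._total

    def _evict(self, now):
        # Drop events that have aged out of the window.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, count = self._events.popleft()
            self._total -= count


def would_exceed(tracker, next_request_tokens, limit_per_minute, now):
    """True if sending the next request now would push usage above the limit."""
    return tracker.tokens_in_window(now) + next_request_tokens > limit_per_minute


# Placeholder limit; the real quota is an assumption to be replaced.
tracker = TokenRateTracker(window_seconds=60.0)
tracker.record(400, now=0.0)
tracker.record(500, now=30.0)
print(would_exceed(tracker, 200, limit_per_minute=1000, now=30.0))
```

Timestamps are passed in explicitly so the logic is easy to test; in real use they could come from `time.monotonic()`, and the token counts from whatever usage figures the provider reports.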

Karl Knechtel, how do you define customer support questions? Almost all questions on this site are about how to use some product, which to me sounds like customer support (and that is fine with me), e.g.: