Testing Auth0 FGA Latency and Throughput
Overview
A useful mental model for Auth0 FGA's performance is to treat it as a database. You can't answer "what is the latency of a database?" or "how many requests per second can it handle?" without additional context. The answers depend on the schema, the stored data, the queries, and the query distribution, which determines how much data can be served from cache.
Performance Testing Strategy
Understanding Performance Dependencies
Auth0 FGA's performance depends on several factors:
- Model complexity: Nested relationships and authorization rules.
- Data distribution: How tuples are distributed across relations.
- Query patterns: Types of requests (Check vs. ListObjects/ListUsers).
- Cache ratio: How often data can be served from cache vs. computed.
Test Design Principles
When designing a test with Auth0 FGA, it is crucial to consider a realistic usage pattern. Some synthetic tests can misrepresent the actual performance. Tests that approximate real-world usage will provide a more accurate representation of how your application will perform.
1. Design Realistic Data Distribution
For example, if you are modeling a simple document management system like the one below:
type user
type organization
  relations
    define admin: [user]
type folder
  relations
    define parent: [folder]
    define organization: [organization]
    define owner: [user]
    define viewer: owner or viewer from parent or admin from organization
type document
  relations
    define parent: [folder]
    define owner: [user]
    define viewer: owner or viewer from parent
You should consider:
- Tuple distribution: Write tuples that follow a real-world distribution for each relation. For example, define some organizations with the maximum number of admins that you have, others with the average number of admins, and others with the minimum number of admins.
- Hierarchy depth: Pay special attention to nested relationships like folder->parent, where a deeply nested hierarchy can impact performance. Use realistic nesting levels.
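As a rough illustration of the two points above, the following Python sketch generates tuples for the example model with a configurable number of admins per organization and randomly nested folder chains. The function name, its parameters, and the ID naming scheme are hypothetical; the output shape (a list of user/relation/object entries) mirrors the JSON tuple format used by the retry example later in this document. Replace the counts with numbers taken from your own data.

```python
import json
import random


def build_tuples(org_admin_counts, chains_per_org, max_depth, docs_per_chain, seed=0):
    """Build tuples for the example model (hypothetical helper).

    org_admin_counts: one entry per organization, e.g. [1, 5, 50] for the
    minimum, average, and maximum number of admins you expect.
    """
    rng = random.Random(seed)  # seeded so the data set is reproducible
    tuples = []
    next_folder = 0
    next_document = 0
    for org_index, admin_count in enumerate(org_admin_counts):
        org = f"organization:org-{org_index}"
        for admin_index in range(admin_count):
            user = f"user:org-{org_index}-admin-{admin_index}"
            tuples.append({"user": user, "relation": "admin", "object": org})
        for _ in range(chains_per_org):
            parent = None
            # Each chain is a run of nested folders between 1 and max_depth deep.
            for _level in range(rng.randint(1, max_depth)):
                folder = f"folder:folder-{next_folder}"
                next_folder += 1
                if parent is None:
                    # The top-level folder is linked to its organization.
                    tuples.append({"user": org, "relation": "organization", "object": folder})
                else:
                    tuples.append({"user": parent, "relation": "parent", "object": folder})
                parent = folder
            # Documents are attached to the deepest folder of the chain.
            for _doc in range(docs_per_chain):
                document = f"document:doc-{next_document}"
                next_document += 1
                tuples.append({"user": parent, "relation": "parent", "object": document})
    return tuples


def write_tuples(path, tuples):
    """Write the tuples as a JSON array of user/relation/object entries."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(tuples, handle, indent=2)
```

Keeping the generator seeded makes it possible to rebuild the same data set when you rerun a test after changing the model.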
2. Model Realistic Usage Patterns
Consider how requests would be distributed during your test period:
- User activity: If your application has 1M users, they won't all be using the application for 10 minutes. Pick a realistic percentage of users based on a real-world approximation of your application.
- Resource access: If each user can access a large number of resources, they would not access them all in a short period. Pick a realistic number of resources based on a real-world approximation of your application.
- API call distribution: Pick a realistic distribution for Check/ListObjects/ListUsers calls. Auth0 FGA is designed to resolve Check requests at high scale with low latency, but ListObjects/ListUsers are expensive to resolve and are generally called much less frequently.
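One way to turn these considerations into a test plan is sketched below in Python. It samples a fraction of users as active, gives each active user a small working set of resources, and draws API calls from a weighted mix. The function, its parameters, and the call names used as dictionary keys are illustrative assumptions, not part of any Auth0 FGA API; the weights should come from your own traffic estimates.

```python
import random


def plan_requests(users, resources, active_fraction, working_set_size,
                  call_weights, total_requests, seed=0):
    """Return a list of (call, user, resource) entries (hypothetical helper)."""
    rng = random.Random(seed)
    # Only a fraction of all users is active during the test window.
    active_count = max(1, int(len(users) * active_fraction))
    active_users = rng.sample(users, active_count)
    # Each active user touches only a small working set of resources.
    working_sets = {
        user: rng.sample(resources, min(working_set_size, len(resources)))
        for user in active_users
    }
    call_names = list(call_weights)
    weights = [call_weights[name] for name in call_names]
    plan = []
    # Draw the call type from the weighted mix, then a user and one of
    # that user's working-set resources.
    for call in rng.choices(call_names, weights=weights, k=total_requests):
        user = rng.choice(active_users)
        plan.append((call, user, rng.choice(working_sets[user])))
    return plan


# Example mix (assumed numbers): mostly Check, few list calls.
example_weights = {"check": 0.95, "list_objects": 0.04, "list_users": 0.01}
```

The resulting plan can be replayed by whatever load tool you use, which keeps the traffic shape separate from the code that sends requests.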
3. Balance Query Types
- Positive vs. negative checks: Negative checks are more expensive to evaluate than positive checks, but positive checks are far more common. Maintain a realistic balance between both when designing tests.
- Consistency requirements: Use the default consistency level for Check (minimize latency) unless you know for certain you'll need a percentage of queries to be run with higher consistency.
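To keep the positive/negative balance explicit, a test can build its Check cases from a known set of granted pairs. The Python sketch below is one possible approach under stated assumptions: `granted` is a set of (user, object) pairs that you already know resolve to allowed (including access inherited through parents), and the function name, parameters, and ratio are hypothetical.

```python
import random


def build_check_cases(granted, users, objects, relation, positive_ratio, total, seed=0):
    """Return Check cases with a fixed share of expected-allowed results."""
    rng = random.Random(seed)
    granted_pairs = sorted(granted)
    positive_count = round(total * positive_ratio)
    cases = []
    # Positive cases: pairs known to be allowed.
    for _ in range(positive_count):
        user, obj = rng.choice(granted_pairs)
        cases.append({"user": user, "relation": relation, "object": obj, "expected": True})
    # Negative cases: random pairs that are not in the granted set.
    attempts = 0
    while len(cases) < total:
        attempts += 1
        if attempts > 100 * total:
            raise ValueError("could not find enough non-granted pairs")
        pair = (rng.choice(users), rng.choice(objects))
        if pair not in granted:
            cases.append({"user": pair[0], "relation": relation, "object": pair[1], "expected": False})
    rng.shuffle(cases)  # interleave positive and negative checks
    return cases
```

Carrying the expected result with each case also lets the test verify correctness while it measures latency.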
Latency Testing
Test Configuration
- RPS requirements: To test latency, you don't need a high number of requests per second (RPS). Auth0 FGA scales horizontally well, and you will not see a significant increase in latency with more load. The trial account rate limits should be sufficient for this.
- Cache warm-up: Auth0 FGA has a multi-level cache. To ensure your tests make use of the cache, sustain at least 1 RPS for at least one minute, or for the duration of the test.
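The warm-up and measurement phases can be sketched as a small Python harness. This is a minimal sketch, not a full benchmarking tool: `call` is any zero-argument function that performs one request (for example, a Check through your client), and the function name, parameters, and default values are assumptions chosen to match the guidance above.

```python
import statistics
import time


def measure_latency(call, warmup_seconds=60, warmup_rps=1.0, samples=200,
                    sleep=time.sleep, clock=time.perf_counter):
    """Warm up at a low, steady rate, then time `samples` calls."""
    # Warm-up phase: roughly warmup_rps calls per second, results discarded.
    for _ in range(int(warmup_seconds * warmup_rps)):
        call()
        sleep(1.0 / warmup_rps)
    # Measurement phase: record the wall-clock time of each call.
    latencies_ms = []
    for _ in range(samples):
        start = clock()
        call()
        latencies_ms.append((clock() - start) * 1000.0)
    # quantiles(n=100) returns 99 cut points; index 49 is the median.
    cuts = statistics.quantiles(latencies_ms, n=100)
    return {
        "samples": len(latencies_ms),
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
    }
```

Reporting percentiles rather than an average makes occasional slow requests, such as cache misses, visible in the results.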
SDK and Client Configuration
- Reuse client instances: We recommend using the OpenFGA SDKs and reusing the same FGA Client instance for all calls. Each client instance requests an access token using the OAuth Client Credentials flow and caches it, and each instance also has its own HTTP connection pool. If you create an instance per request, you'll incur significant overhead on every call.
- Without SDKs: If you are not using the SDKs, request the access token once and send it in each request, and use your platform's HTTP connection pool. Do not request a new token with every request, as this can significantly impact performance.
- Always set the authorization_model_id parameter: This parameter ensures that Auth0 FGA can directly use the specified authorization model for queries, avoiding an extra database read to fetch the latest model. Omitting it may increase latency and reduce performance, especially under high load.
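If you are not using the SDKs, the token reuse described above can be sketched in Python as follows. The class and function names are hypothetical. `fetch_token` is any function you supply that calls your token endpoint and returns the access token together with its lifetime in seconds (the `access_token` and `expires_in` fields of a standard OAuth Client Credentials response). The request body shape in `check_body` is an assumption to verify against the Check API reference.

```python
import time


class CachedToken:
    """Fetch an access token once and reuse it until shortly before expiry."""

    def __init__(self, fetch_token, clock=time.monotonic, refresh_margin_seconds=60):
        self._fetch_token = fetch_token  # returns (access_token, expires_in_seconds)
        self._clock = clock
        self._margin = refresh_margin_seconds
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        # Refresh only when there is no token yet or it is about to expire.
        if self._token is None or now >= self._expires_at - self._margin:
            token, expires_in = self._fetch_token()
            self._token = token
            self._expires_at = now + expires_in
        return self._token


def check_body(authorization_model_id, user, relation, obj):
    """Build a Check request body that pins the authorization model (assumed shape)."""
    return {
        "authorization_model_id": authorization_model_id,
        "tuple_key": {"user": user, "relation": relation, "object": obj},
    }
```

Create one `CachedToken` per process and share it across requests, alongside a single pooled HTTP client, so that neither tokens nor connections are set up per call.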
Infrastructure Considerations
- Geographic location: When creating an Auth0 FGA store, you can pick a jurisdiction (US, Europe, Australia). These are hosted in AWS us-east-1 and us-west-2 for US; eu-west-1 and eu-central-1 for Europe; and ap-southeast-2 and ap-southeast-4 for Australia. Latency will depend on how far your application services are from the store's AWS regions.
- Test environment: Run the tests from a cloud server in the same region where you would deploy your applications. Running tests from your local machine won't provide an accurate representation of latency.
Monitoring and Measurement
- Observability: Consider using the OpenTelemetry support to measure RPS/Latency on your side. Auth0 FGA returns an fga-query-duration-ms header that you can use to understand how long the query took on the server. This is also reported by OpenTelemetry in the SDKs.
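When you time requests yourself, comparing the client-side round trip with the fga-query-duration-ms header gives a rough split between server time and everything else (network, TLS, client overhead). The Python helper below is a minimal sketch; its name and return shape are hypothetical, and it assumes the header value parses as a number of milliseconds.

```python
def split_latency(round_trip_ms, response_headers):
    """Split a measured round trip into server time and remaining overhead."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in response_headers.items()}
    raw = normalized.get("fga-query-duration-ms")
    if raw is None:
        return None  # header not present on this response
    server_ms = float(raw)
    return {
        "server_ms": server_ms,
        "overhead_ms": max(0.0, round_trip_ms - server_ms),
    }
```

A consistently large overhead relative to server time usually points at the distance between the test machine and the store's region rather than at query cost.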
Load Testing
If you want to perform a load test, the free trial rate limits won't be sufficient. Please contact your Account Executive to discuss the best approach to run it.
Auth0 FGA infrastructure gradually scales up on demand. During the ramp-up period, you will hit lower rate limits while the system scales to accommodate your requests. To avoid those lower limits, we recommend that you ramp up requests by 100 RPS every minute once your account is configured.
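The recommended ramp can be expressed as a per-minute schedule. The Python sketch below computes one; the function name is hypothetical, and starting at 100 RPS in the first minute is an assumption, since the guidance above only specifies the size of each step.

```python
def ramp_schedule(target_rps, step_rps=100, start_rps=100):
    """Return (minute, rps) pairs that add step_rps each minute up to target_rps."""
    if target_rps <= 0 or step_rps <= 0 or start_rps <= 0:
        raise ValueError("target_rps, step_rps, and start_rps must be positive")
    schedule = []
    minute = 0
    rps = min(start_rps, target_rps)
    while True:
        schedule.append((minute, rps))
        if rps >= target_rps:
            break
        # Increase by one step per minute, without overshooting the target.
        rps = min(rps + step_rps, target_rps)
        minute += 1
    return schedule


# Reaching 450 RPS takes five one-minute stages: 100, 200, 300, 400, 450.
print(ramp_schedule(450))
```

Most load tools accept a staged profile of this kind, so the schedule can be fed to the tool instead of starting at the full target rate.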
Data Import and Setup
Importing Tuples with the CLI
To import tuples, we recommend using the FGA CLI from a CSV/JSON/YAML file:
fga tuple write --store-id=<store_id> --file <file_name>.csv --max-rps <value>
You should use the following values:
- --max-rps: the maximum RPS that your account supports. See the Auth0 FGA rate limits documentation for the limits that apply to your account.
- --hide-imported-tuples: to avoid writing every successfully imported tuple to stdout.
If you are using a trial version, you should use:
fga tuple write --store-id=<store_id> --file tuples.csv --max-rps 20 --hide-imported-tuples
If you are using an Enterprise version with a 150 RPS limit, you should use:
fga tuple write --store-id=<store_id> --file tuples.csv --max-rps 150 --hide-imported-tuples
Retrying Tuples with Errors
In some cases you may want to retry failed tuples (for example, after a network connectivity error). To do that, redirect the output to a file:
fga tuple write --file tuples.json --hide-imported-tuples > results.json
Then, process the file with jq to convert it to a format that you can send to the CLI again:
jq -c '[.failed[] | {user: .tuple_key.user, relation: .tuple_key.relation, object: .tuple_key.object}]' results.json > failed_tuples.json
fga tuple write --file failed_tuples.json --hide-imported-tuples
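If jq is not available, the same transformation can be sketched in Python. The helper names are hypothetical, and the code assumes only what the jq filter above relies on: a top-level `failed` array whose entries carry a `tuple_key` with `user`, `relation`, and `object`. Unlike the jq filter, it returns an empty list when there is no `failed` array.

```python
import json


def extract_failed_tuples(results):
    """Return the failed tuples as a list of user/relation/object entries."""
    return [
        {
            "user": entry["tuple_key"]["user"],
            "relation": entry["tuple_key"]["relation"],
            "object": entry["tuple_key"]["object"],
        }
        for entry in (results.get("failed") or [])
    ]


def convert_results_file(results_path, output_path):
    """Read the CLI output file and write the retry file."""
    with open(results_path, encoding="utf-8") as handle:
        results = json.load(handle)
    with open(output_path, "w", encoding="utf-8") as handle:
        json.dump(extract_failed_tuples(results), handle)
```

Calling `convert_results_file("results.json", "failed_tuples.json")` produces the file that the last command above sends back to the CLI.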