Designing Firestore Database Models to Minimize Costs

January 16, 2025 - Firebase Firestore

When transitioning from relational databases to NoSQL databases like Firebase Firestore, developers often overlook the critical costs associated with reads, writes, and storage. Unlike relational databases, where normalization and foreign key relationships are the norm, Firestore’s cost model necessitates a different approach to data modeling. By understanding Firestore’s pricing model and carefully designing the database structure, you can significantly reduce operational costs while maintaining performance.

Key Firestore Pricing Factors

Reads: Each document read counts as one read operation, even if you need only a single field.

Writes: Creating or updating a document incurs a write cost.

Storage: Firestore charges for the total size of stored documents, including their indexes.
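To make these three factors concrete, here is a minimal sketch of a monthly cost estimate. The `Rates` values are placeholders chosen for illustration, not quoted prices: real Firestore rates vary by region and change over time, and the sketch ignores any free quota. The names `Rates` and `monthly_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Placeholder USD rates for illustration only; substitute the published ones."""
    per_100k_reads: float = 0.06
    per_100k_writes: float = 0.18
    per_gib_month: float = 0.18


def monthly_cost(reads: int, writes: int, stored_gib: float, rates: Rates = Rates()) -> float:
    """Estimate a monthly bill from operation counts and stored data size."""
    read_cost = reads / 100_000 * rates.per_100k_reads
    write_cost = writes / 100_000 * rates.per_100k_writes
    storage_cost = stored_gib * rates.per_gib_month
    return read_cost + write_cost + storage_cost


# 10M reads, 1M writes, and 5 GiB stored, under the placeholder rates above.
estimate = monthly_cost(reads=10_000_000, writes=1_000_000, stored_gib=5.0)
print(f"estimated monthly cost: ${estimate:.2f}")
```

Even with made-up rates, the shape of the formula shows why operation counts, rather than query complexity, tend to dominate the bill.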

Principles for Cost-Effective Firestore Modeling

1. Denormalization vs. Normalization

In relational databases, normalization is essential to eliminate redundancy. However, Firestore encourages denormalization to reduce reads. That said, excessive denormalization can lead to increased write costs if data updates propagate to multiple locations. Balancing denormalization with selective normalization is key.

2. Evaluate Read and Write Patterns

Analyze how frequently data is read versus written. High read frequencies favor denormalization, whereas frequent writes may necessitate normalization to avoid duplicating updates across multiple documents.
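The trade-off can be sketched as a simple break-even comparison. The model below is an assumption, not a Firestore rule: it supposes that referencing costs one extra read per view (no caching), that embedding costs one extra write per embedded copy on every update, and that a write costs three times as much as a read. All function and parameter names are hypothetical.

```python
def extra_cost_of_embedding(updates: int, copies: int, write_price: float) -> float:
    """Each update must be rewritten into every document holding a copy."""
    return updates * copies * write_price


def extra_cost_of_referencing(views: int, read_price: float) -> float:
    """Each view needs one additional read to resolve the reference."""
    return views * read_price


def cheaper_model(views: int, updates: int, copies: int,
                  read_price: float = 1.0, write_price: float = 3.0) -> str:
    """Return 'embed' or 'reference', whichever adds less cost under this model."""
    embed = extra_cost_of_embedding(updates, copies, write_price)
    reference = extra_cost_of_referencing(views, read_price)
    return "embed" if embed <= reference else "reference"


# Frequently edited profile copied into many posts: referencing wins.
print(cheaper_model(views=10_000, updates=50, copies=200))
# Rarely edited profile copied into a few reviews: embedding wins.
print(cheaper_model(views=10_000, updates=2, copies=20))
```

The model leaves out the reads needed to locate the copies before rewriting them, which would push the balance further toward referencing for frequently updated data.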

3. Leverage Relational IDs for Frequently Updated Data

For entities that are frequently updated, such as user profiles in dynamic environments like social media, storing references (e.g., user IDs) instead of embedding the full object can drastically reduce write costs. Fetching the referenced data at runtime ensures that updates do not cascade to multiple documents.

4. Embed Static or Rarely Updated Data

In scenarios where the data seldom changes, embedding the entire object reduces read costs by avoiding additional lookups. For example, in a hotel booking system where profile updates are infrequent, embedding user details in reviews saves a read each time a review is displayed while adding very few propagation writes.

Use Cases

Use Case 1: Social Media Profiles (High Update Frequency)

Imagine a LinkedIn-like application where users frequently update their profiles. If user details are embedded in posts or comments, every profile update would necessitate propagating changes to all related documents. Instead, store a reference to the user ID:

Posts Collection:

{
  "postId": "123",
  "content": "Excited to share this update!",
  "authorId": "user_456"
}

Users Collection:

{
  "userId": "user_456",
  "name": "Jane Doe",
  "profilePicture": "https://...",
  "headline": "Software Engineer"
}

When displaying a post, fetch the user data using the authorId. This adds an extra read for each author you need to resolve (fewer if you cache or deduplicate authors), but it eliminates the need to update multiple documents whenever the user profile changes.
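The read pattern for the reference approach can be sketched without the Firestore SDK. `CountingStore` below is a hypothetical in-memory stand-in that counts reads the way the pricing model does, and `posts_with_authors` is an illustrative helper that resolves each distinct author only once.

```python
class CountingStore:
    """In-memory stand-in for a document store that counts billed reads."""

    def __init__(self, docs):
        self._docs = docs
        self.reads = 0

    def get(self, collection, doc_id):
        self.reads += 1  # every document fetch counts as one read
        return self._docs[collection][doc_id]


def posts_with_authors(store, post_ids):
    """Load posts, then fetch each distinct author a single time."""
    posts = [store.get("posts", post_id) for post_id in post_ids]
    authors = {}
    for post in posts:
        author_id = post["authorId"]
        if author_id not in authors:
            authors[author_id] = store.get("users", author_id)
    return [{**post, "author": authors[post["authorId"]]} for post in posts]


store = CountingStore({
    "posts": {
        "123": {"postId": "123", "content": "Excited to share this update!", "authorId": "user_456"},
        "124": {"postId": "124", "content": "Another update", "authorId": "user_456"},
        "125": {"postId": "125", "content": "Hello", "authorId": "user_789"},
    },
    "users": {
        "user_456": {"userId": "user_456", "name": "Jane Doe"},
        "user_789": {"userId": "user_789", "name": "Sam Lee"},
    },
})

feed = posts_with_authors(store, ["123", "124", "125"])
print(store.reads)  # 3 post reads + 2 distinct author reads
```

Deduplicating authors keeps the overhead proportional to the number of distinct authors on a page rather than the number of posts.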

Use Case 2: Hotel Booking Profiles (Low Update Frequency)

In a hotel booking system, user profiles are updated infrequently. Embedding user data directly into related documents such as reviews or booking history reduces read costs:

Reviews Collection:

{
  "reviewId": "789",
  "hotelId": "hotel_123",
  "user": {
    "userId": "user_456",
    "name": "John Smith",
    "email": "john.smith@example.com"
  },
  "rating": 5,
  "comment": "Fantastic stay!"
}

In this scenario, the minimal write frequency for user profiles ensures that embedding the data does not lead to high propagation costs.
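The propagation cost that embedding trades for cheaper reads can be illustrated with a small in-memory sketch. `propagate_profile_update` is a hypothetical helper operating on plain dictionaries; in a real application each rewritten review would be a billed write, and finding the affected reviews would also cost reads.

```python
def propagate_profile_update(reviews, user_id, changes):
    """Update every embedded copy of a user's profile; return the number of documents rewritten."""
    writes = 0
    for review in reviews:
        if review["user"]["userId"] == user_id:
            review["user"].update(changes)  # modifies the embedded copy in place
            writes += 1
    return writes


reviews = [
    {"reviewId": "789", "hotelId": "hotel_123",
     "user": {"userId": "user_456", "name": "John Smith"}, "rating": 5},
    {"reviewId": "790", "hotelId": "hotel_555",
     "user": {"userId": "user_456", "name": "John Smith"}, "rating": 4},
    {"reviewId": "791", "hotelId": "hotel_123",
     "user": {"userId": "user_999", "name": "Ana Ruiz"}, "rating": 3},
]

# One profile change fans out to every review the user wrote.
review_writes = propagate_profile_update(reviews, "user_456", {"name": "John A. Smith"})
total_writes = review_writes + 1  # plus the write to the profile document itself
print(total_writes)
```

With only a handful of reviews per user and rare profile edits, this fan-out stays small, which is what makes embedding attractive here.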

Practical Guidelines

Profile High-Traffic Entities: Use analytics to determine which entities are read or written frequently. Tailor your model based on usage patterns.

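Combine Data Where Necessary: For pages requiring multiple queries, consider aggregating related data into a single document so the page loads with one read instead of several. Keep the combined document well under Firestore's 1 MiB document size limit.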
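One common form of aggregation is a summary document that is updated on each write, so a page can show totals with a single read. The sketch below keeps a running rating average for a hotel; the field names (`ratingCount`, `ratingTotal`, `averageRating`) and the `add_rating` helper are hypothetical, and a real implementation would likely perform this update inside a transaction to avoid lost updates.

```python
def add_rating(summary, rating):
    """Return a new summary document with the rating folded into the running totals."""
    count = summary["ratingCount"] + 1
    total = summary["ratingTotal"] + rating
    return {
        **summary,
        "ratingCount": count,
        "ratingTotal": total,
        "averageRating": total / count,
    }


summary = {"hotelId": "hotel_123", "ratingCount": 0, "ratingTotal": 0, "averageRating": None}
for rating in (5, 4, 3):
    summary = add_rating(summary, rating)

print(summary["averageRating"])
```

The hotel page then reads this one document instead of reading every review to compute the average, at the price of one extra write per new review.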

Use Subcollections for Scalability: For one-to-many relationships, subcollections can group related data while maintaining granularity.
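As a small illustration, reviews could live in a subcollection under each hotel. Firestore paths alternate between collection and document segments, so a document path always has an even number of segments. The `hotels/.../reviews/...` layout and both helper names below are assumptions made for this sketch.

```python
def review_path(hotel_id, review_id):
    """Build the path of a review stored in a hotel's 'reviews' subcollection."""
    return f"hotels/{hotel_id}/reviews/{review_id}"


def is_document_path(path):
    """A document path has an even number of non-empty segments."""
    segments = path.split("/")
    return len(segments) % 2 == 0 and all(segments)


path = review_path("hotel_123", "789")
print(path)
print(is_document_path(path))
print(is_document_path("hotels/hotel_123/reviews"))
```

Reading the hotel document does not read its subcollections, so a hotel page that does not show reviews pays nothing for them.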

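Limit Indexes: Firestore automatically indexes every field in a document, and each index entry adds to storage costs. Add single-field index exemptions for fields you never filter or sort on, such as long text bodies, and keep only the indexes your queries actually need.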
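Index exemptions are typically declared in an index configuration file deployed with the Firebase CLI. The sketch below generates such a structure in Python; the `fieldOverrides` layout reflects my understanding of the `firestore.indexes.json` format and should be treated as an assumption to verify against the current Firebase documentation. The `index_exemptions` helper and the choice of fields are hypothetical.

```python
import json


def index_exemptions(collection_group, field_paths):
    """Build a config that turns off single-field indexing for the given fields."""
    return {
        "indexes": [],
        "fieldOverrides": [
            # An empty "indexes" list is assumed to mean "no single-field indexes".
            {"collectionGroup": collection_group, "fieldPath": path, "indexes": []}
            for path in field_paths
        ],
    }


# Exempt fields in the reviews collection that are displayed but never queried.
config = index_exemptions("reviews", ["comment", "user.email"])
print(json.dumps(config, indent=2))
```

Fields that appear in filters or sort orders, such as hotelId or rating in the review example, would be left out of the exemption list.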

Conclusion

Firestore’s pricing model rewards thoughtful data structuring. By strategically balancing denormalization and normalization, using relational IDs for frequently updated entities, and embedding static data, you can optimize costs while maintaining performance. Evaluate your app’s read and write patterns regularly to adapt your data model as user behavior evolves.
