mirror of https://github.com/Vonng/ddia.git synced 2026-06-21 17:07:12 +08:00
ddia/content/en/indexes.md
2026-02-15 10:54:24 +08:00

---
title: Indexes
weight: 550
breadcrumbs: false
---
### Symbols
- 3FS (distributed filesystem), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
### A
- aborts (transactions), [Transactions](/en/ch8#ch_transactions), [Atomicity](/en/ch8#sec_transactions_acid_atomicity)
- cascading, [No dirty reads](/en/ch8#no-dirty-reads)
- in two-phase commit, [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc)
- performance of optimistic concurrency control, [Performance of serializable snapshot isolation](/en/ch8#performance-of-serializable-snapshot-isolation)
- retrying aborted transactions, [Handling errors and aborts](/en/ch8#handling-errors-and-aborts)
- abstraction, [Layering of cloud services](/en/ch1#layering-of-cloud-services), [Simplicity: Managing Complexity](/en/ch2#id38), [Data Models and Query Languages](/en/ch3#ch_datamodels), [Transactions](/en/ch8#ch_transactions), [Summary](/en/ch8#summary)
- accidental complexity, [Simplicity: Managing Complexity](/en/ch2#id38)
- accountability, [Responsibility and Accountability](/en/ch14#id371)
- accounting (financial data), [Summary](/en/ch3#summary), [Advantages of immutable events](/en/ch12#sec_stream_immutability_pros)
- Accumulo (database)
- wide-column data model, [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality), [Column Compression](/en/ch4#sec_storage_column_compression)
- ACID properties (transactions), [The Meaning of ACID](/en/ch8#sec_transactions_acid)
- atomicity, [Atomicity](/en/ch8#sec_transactions_acid_atomicity), [Single-Object and Multi-Object Operations](/en/ch8#sec_transactions_multi_object)
- consistency, [Consistency](/en/ch8#sec_transactions_acid_consistency), [Maintaining integrity in the face of software bugs](/en/ch13#id455)
- durability, [Making B-trees reliable](/en/ch4#sec_storage_btree_wal), [Durability](/en/ch8#durability)
- isolation, [Isolation](/en/ch8#sec_transactions_acid_isolation), [Single-Object and Multi-Object Operations](/en/ch8#sec_transactions_multi_object)
- acknowledgements (messaging), [Acknowledgments and redelivery](/en/ch12#sec_stream_reordering)
- active/active replication (see multi-leader replication)
- active/passive replication (see leader-based replication)
- ActiveMQ (messaging), [Message brokers](/en/ch5#message-brokers), [Message brokers compared to databases](/en/ch12#id297)
- distributed transaction support, [XA transactions](/en/ch8#xa-transactions)
- ActiveRecord (object-relational mapper), [Object-relational mapping (ORM)](/en/ch3#object-relational-mapping-orm), [Handling errors and aborts](/en/ch8#handling-errors-and-aborts)
- activity (workflows) (see workflow engines)
- actor model, [Distributed actor frameworks](/en/ch5#distributed-actor-frameworks)
- (see also event-driven architecture)
- comparison to stream processing, [Event-Driven Architectures and RPC](/en/ch12#sec_stream_actors_drpc)
- adaptive capacity, [Skewed Workloads and Relieving Hot Spots](/en/ch7#sec_sharding_skew)
- Advanced Message Queuing Protocol (see AMQP)
- aerospace systems, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)
- Aerospike (database)
- strong consistency mode, [Single-object writes](/en/ch8#sec_transactions_single_object)
- AGE (graph database), [The Cypher Query Language](/en/ch3#id57)
- aggregation
- data cubes and materialized views, [Materialized Views and Data Cubes](/en/ch4#sec_storage_materialized_views)
- in batch processes, [Sorting Versus In-memory Aggregation](/en/ch11#id275)
- in stream processes, [Stream analytics](/en/ch12#id318)
- aggregation pipeline (MongoDB), [Normalization, Denormalization, and Joins](/en/ch3#sec_datamodels_normalization), [Query languages for documents](/en/ch3#query-languages-for-documents)
- Agile, [Evolvability: Making Change Easy](/en/ch2#sec_introduction_evolvability)
- minimizing irreversibility, [Batch Processing](/en/ch11#ch_batch), [Reprocessing data for application evolution](/en/ch13#sec_future_reprocessing)
- moving faster with confidence, [The end-to-end argument again](/en/ch13#id456)
- agreement, [Single-value consensus](/en/ch10#single-value-consensus), [Atomic commitment as consensus](/en/ch10#atomic-commitment-as-consensus)
- (see also consensus)
- AI (artificial intelligence) (see machine learning)
- AI Act (European Union), [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance)
- AirByte, [Data Warehousing](/en/ch1#sec_introduction_dwh)
- Airflow (workflow scheduler), [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows), [Batch Processing](/en/ch11#ch_batch), [Scheduling Workflows](/en/ch11#sec_batch_workflows)
- cloud data warehouse integration, [Query languages](/en/ch11#sec_batch_query_lanauges)
- use for ETL, [Extract–Transform–Load (ETL)](/en/ch11#sec_batch_etl_usage)
- Akamai
- response time study, [Average, Median, and Percentiles](/en/ch2#id24)
- algorithms
- algorithm correctness, [Defining the correctness of an algorithm](/en/ch9#defining-the-correctness-of-an-algorithm)
- B-trees, [B-Trees](/en/ch4#sec_storage_b_trees)-[B-tree variants](/en/ch4#b-tree-variants)
- for distributed systems, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- mergesort, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables), [Shuffling Data](/en/ch11#sec_shuffle)
- scheduling, [Resource Allocation](/en/ch11#id279)
- SSTables and LSM-trees, [The SSTable file format](/en/ch4#the-sstable-file-format)-[Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- all-to-all replication topologies, [Multi-leader replication topologies](/en/ch6#sec_replication_topologies)
- AllegroGraph (database), [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- SPARQL query language, [The SPARQL query language](/en/ch3#the-sparql-query-language)
- ALTER TABLE statement (SQL), [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility), [Encoding and Evolution](/en/ch5#ch_encoding)
- Amazon
- Dynamo (see Dynamo (database))
- response time study, [Average, Median, and Percentiles](/en/ch2#id24)
- Amazon Web Services (AWS)
- Aurora (see Aurora (cloud database))
- ClockBound (see ClockBound (time sync))
- correctness testing, [Formal Methods and Randomized Testing](/en/ch9#sec_distributed_formal)
- DynamoDB (see DynamoDB (database))
- EBS (see EBS (virtual block device))
- Kinesis (see Kinesis (messaging))
- Neptune (see Neptune (graph database))
- network reliability, [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)
- S3 (see S3 (object storage))
- amplification
- of bias, [Bias and Discrimination](/en/ch14#id370)
- of failures, [Maintaining derived state](/en/ch13#id446)
- of tail latency, [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla), [Local Secondary Indexes](/en/ch7#id166)
- write amplification, [Write amplification](/en/ch4#write-amplification)
- AMQP (Advanced Message Queuing Protocol), [Message brokers compared to databases](/en/ch12#id297)
- (see also messaging systems)
- comparison to log-based messaging, [Logs compared to traditional messaging](/en/ch12#sec_stream_logs_vs_messaging), [Replaying old messages](/en/ch12#sec_stream_replay)
- message ordering, [Acknowledgments and redelivery](/en/ch12#sec_stream_reordering)
- analytical systems, [Operational Versus Analytical Systems](/en/ch1#sec_introduction_analytics)
- as derived data systems, [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived)
- ETL from operational systems, [Data Warehousing](/en/ch1#sec_introduction_dwh)
- governance, [Beyond the data lake](/en/ch1#beyond-the-data-lake)
- analytics, [Operational Versus Analytical Systems](/en/ch1#sec_introduction_analytics)-[Systems of Record and Derived Data](/en/ch1#sec_introduction_derived)
- comparison to transaction processing, [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp)
- data normalization, [Trade-offs of normalization](/en/ch3#trade-offs-of-normalization)
- data warehousing (see data warehousing)
- predictive (see predictive analytics)
- relation to batch processing, [Analytics](/en/ch11#sec_batch_olap)
- schemas for, [Stars and Snowflakes: Schemas for Analytics](/en/ch3#sec_datamodels_analytics)
- snapshot isolation for queries, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)
- stream analytics, [Stream analytics](/en/ch12#id318)
- analytics engineering, [Operational Versus Analytical Systems](/en/ch1#sec_introduction_analytics)
- anti-entropy, [Catching up on missed writes](/en/ch6#sec_replication_read_repair)
- Antithesis (deterministic simulation testing), [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- Apache Accumulo (see Accumulo)
- Apache ActiveMQ (see ActiveMQ)
- Apache AGE (see AGE)
- Apache Arrow (see Arrow (data format))
- Apache Avro (see Avro)
- Apache Beam (see Beam)
- Apache BookKeeper (see BookKeeper)
- Apache Cassandra (see Cassandra)
- Apache Curator (see Curator)
- Apache DataFusion (see DataFusion (query engine))
- Apache Druid (see Druid (database))
- Apache Flink (see Flink (processing framework))
- Apache HBase (see HBase)
- Apache Iceberg (see Iceberg (table format))
- Apache Jena (see Jena)
- Apache Kafka (see Kafka)
- Apache Lucene (see Lucene)
- Apache Oozie (see Oozie (workflow scheduler))
- Apache ORC (see ORC (data format))
- Apache Parquet (see Parquet (data format))
- Apache Pig (query language), [Query languages](/en/ch11#sec_batch_query_lanauges)
- Apache Pinot (see Pinot (database))
- Apache Pulsar (see Pulsar)
- Apache Qpid (see Qpid)
- Apache Samza (see Samza)
- Apache Solr (see Solr)
- Apache Spark (see Spark (processing framework))
- Apache Storm (see Storm)
- Apache Superset (see Superset (data visualization software))
- Apache Thrift (see Thrift)
- Apache ZooKeeper (see ZooKeeper)
- Apama (stream analytics), [Complex event processing](/en/ch12#id317)
- append-only files (see logs)
- Application Programming Interfaces (APIs), [Data Models and Query Languages](/en/ch3#ch_datamodels)
- for change streams, [API support for change streams](/en/ch12#sec_stream_change_api)
- for distributed transactions, [XA transactions](/en/ch8#xa-transactions)
- for services, [Dataflow Through Services: REST and RPC](/en/ch5#sec_encoding_dataflow_rpc)-[Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- (see also services)
- evolvability, [Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- RESTful, [Web services](/en/ch5#sec_web_services)
- application state (see state)
- approximate search (see similarity search)
- archival storage, data from databases, [Archival storage](/en/ch5#archival-storage)
- arcs (see edges)
- ArcticDB (database), [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- arithmetic mean, [Average, Median, and Percentiles](/en/ch2#id24)
- arrays
- array databases, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- multidimensional, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- Arrow (data format), [Column-Oriented Storage](/en/ch4#sec_storage_column), [DataFrames](/en/ch11#id287)
- artificial intelligence (see machine learning)
- ASCII text, [Protocol Buffers](/en/ch5#sec_encoding_protobuf)
- ASN.1 (schema language), [The Merits of Schemas](/en/ch5#sec_encoding_schemas)
- associative table, [Many-to-One and Many-to-Many Relationships](/en/ch3#sec_datamodels_many_to_many), [Property Graphs](/en/ch3#id56)
- asynchronous networks, [Unreliable Networks](/en/ch9#sec_distributed_networks), [Glossary](/en/glossary)
- comparison to synchronous networks, [Synchronous Versus Asynchronous Networks](/en/ch9#sec_distributed_sync_networks)
- system model, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- asynchronous replication, [Synchronous Versus Asynchronous Replication](/en/ch6#sec_replication_sync_async), [Glossary](/en/glossary)
- data loss on failover, [Leader failure: Failover](/en/ch6#leader-failure-failover)
- reads from asynchronous follower, [Problems with Replication Lag](/en/ch6#sec_replication_lag)
- with multiple leaders, [Multi-Leader Replication](/en/ch6#sec_replication_multi_leader)
- Asynchronous Transfer Mode (ATM), [Can we not simply make network delays predictable?](/en/ch9#can-we-not-simply-make-network-delays-predictable)
- atomic broadcast, [Shared logs as consensus](/en/ch10#sec_consistency_shared_logs)
- atomic clocks, [Clock readings with a confidence interval](/en/ch9#clock-readings-with-a-confidence-interval), [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- (see also clocks)
- atomicity (concurrency), [Glossary](/en/glossary)
- atomic increment, [Single-object writes](/en/ch8#sec_transactions_single_object)
- compare-and-set (CAS), [Conditional writes (compare-and-set)](/en/ch8#sec_transactions_compare_and_set), [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- (see also compare-and-set (CAS))
- denormalized data, [Trade-offs of normalization](/en/ch3#trade-offs-of-normalization)
- fetch-and-add/increment, [ID Generators and Logical Clocks](/en/ch10#sec_consistency_logical), [Consensus](/en/ch10#sec_consistency_consensus), [Fetch-and-add as consensus](/en/ch10#fetch-and-add-as-consensus)
- write operations, [Atomic write operations](/en/ch8#atomic-write-operations)
- atomicity (transactions), [Atomicity](/en/ch8#sec_transactions_acid_atomicity), [Single-Object and Multi-Object Operations](/en/ch8#sec_transactions_multi_object), [Glossary](/en/glossary)
- atomic commit
- avoiding, [Multi-shard request processing](/en/ch13#id360), [Coordination-avoiding data systems](/en/ch13#id454)
- blocking and nonblocking, [Three-phase commit](/en/ch8#three-phase-commit)
- in stream processing, [Exactly-once message processing](/en/ch8#sec_transactions_exactly_once), [Exactly-once message processing revisited](/en/ch8#exactly-once-message-processing-revisited), [Atomic commit revisited](/en/ch12#sec_stream_atomic_commit)
- maintaining derived data, [Keeping Systems in Sync](/en/ch12#sec_stream_sync)
- distributed transactions, [Distributed Transactions](/en/ch8#sec_transactions_distributed)-[Exactly-once message processing revisited](/en/ch8#exactly-once-message-processing-revisited)
- for multi-object transactions, [Single-Object and Multi-Object Operations](/en/ch8#sec_transactions_multi_object)
- for single-object writes, [Single-object writes](/en/ch8#sec_transactions_single_object)
- relation to consensus, [Atomic commitment as consensus](/en/ch10#atomic-commitment-as-consensus)
- auditability, [Trust, but Verify](/en/ch13#sec_future_verification)-[Tools for auditable data systems](/en/ch13#id366)
- designing for, [Designing for auditability](/en/ch13#id365)
- self-auditing systems, [Don't just blindly trust what they promise](/en/ch13#id364)
- through immutability, [Advantages of immutable events](/en/ch12#sec_stream_immutability_pros)
- tools for auditable data systems, [Tools for auditable data systems](/en/ch13#id366)
- Aurora (cloud database), [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native)
- Aurora DSQL (database)
- snapshot isolation support, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)
- auto-scaling, [Operations: Automatic or Manual Rebalancing](/en/ch7#sec_sharding_operations)
- Automerge (CRDT library), [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- availability, [Reliability and Fault Tolerance](/en/ch2#sec_introduction_reliability)
- (see also fault tolerance)
- in CAP theorem, [The CAP theorem](/en/ch10#the-cap-theorem)
- in leader election, [Subtleties of consensus](/en/ch10#subtleties-of-consensus)
- in service level agreements (SLAs), [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla)
- availability zones, [Tolerating hardware faults through redundancy](/en/ch2#tolerating-hardware-faults-through-redundancy), [Reading Your Own Writes](/en/ch6#sec_replication_ryw)
- Avro (data format), [Avro](/en/ch5#sec_encoding_avro)-[Dynamically generated schemas](/en/ch5#dynamically-generated-schemas)
- dynamically generated schemas, [Dynamically generated schemas](/en/ch5#dynamically-generated-schemas)
- object container files, [But what is the writer's schema?](/en/ch5#but-what-is-the-writers-schema), [Archival storage](/en/ch5#archival-storage)
- reader determining writer's schema, [But what is the writer's schema?](/en/ch5#but-what-is-the-writers-schema)
- schema evolution, [The writer's schema and the reader's schema](/en/ch5#the-writers-schema-and-the-readers-schema)
- use in batch processing, [MapReduce](/en/ch11#sec_batch_mapreduce)
- awk (Unix tool), [Simple Log Analysis](/en/ch11#sec_batch_log_analysis), [Distributed Job Orchestration](/en/ch11#id278)
- Axon Framework, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- Azkaban (workflow scheduler), [Batch Processing](/en/ch11#ch_batch)
- Azure Blob Storage (object storage), [Layering of cloud services](/en/ch1#layering-of-cloud-services), [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- conditional headers, [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens)
- Azure managed disks, [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute)
- Azure SQL DB (database), [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native)
- Azure Storage, [Object Stores](/en/ch11#id277)
- Azure Synapse Analytics (database), [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native)
- Azure Virtual Machines
- spot virtual machines, [Handling Faults](/en/ch11#id281)
### B
- B-trees (indexes), [B-Trees](/en/ch4#sec_storage_b_trees)-[B-tree variants](/en/ch4#b-tree-variants)
- B+ trees, [B-tree variants](/en/ch4#b-tree-variants)
- branching factor, [B-Trees](/en/ch4#sec_storage_b_trees)
- comparison to LSM-trees, [Comparing B-Trees and LSM-Trees](/en/ch4#sec_storage_btree_lsm_comparison)-[Disk space usage](/en/ch4#disk-space-usage)
- crash recovery, [Making B-trees reliable](/en/ch4#sec_storage_btree_wal)
- growing by splitting a page, [B-Trees](/en/ch4#sec_storage_b_trees)
- immutable variants, [B-tree variants](/en/ch4#b-tree-variants), [Indexes and snapshot isolation](/en/ch8#indexes-and-snapshot-isolation)
- similarity to shard splitting, [Rebalancing key-range sharded data](/en/ch7#rebalancing-key-range-sharded-data)
- variants, [B-tree variants](/en/ch4#b-tree-variants)
- B2 (object storage), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- Backblaze B2 (see B2 (object storage))
- backend, [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs)
- backoff, exponential, [Describing Performance](/en/ch2#sec_introduction_percentiles), [Handling errors and aborts](/en/ch8#handling-errors-and-aborts)
- backpressure, [Describing Performance](/en/ch2#sec_introduction_percentiles), [Read performance](/en/ch4#read-performance), [Messaging Systems](/en/ch12#sec_stream_messaging), [Glossary](/en/glossary)
- in batch processing, [Scheduling Workflows](/en/ch11#sec_batch_workflows)
- in TCP, [The Limitations of TCP](/en/ch9#sec_distributed_tcp)
- backups
- database snapshot for replication, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- in multitenant systems, [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- integrity of, [Don't just blindly trust what they promise](/en/ch13#id364)
- snapshot isolation for, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)
- using object storage, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- versus replication, [Replication](/en/ch6#ch_replication)
- backward compatibility, [Encoding and Evolution](/en/ch5#ch_encoding)
- BadgerDB (database)
- serializable transactions, [Serializable Snapshot Isolation (SSI)](/en/ch8#sec_transactions_ssi)
- BASE, contrast to ACID, [The Meaning of ACID](/en/ch8#sec_transactions_acid)
- bash shell (Unix), [Storage and Indexing for OLTP](/en/ch4#sec_storage_oltp)
- batch processing, [Batch Processing](/en/ch11#ch_batch)-[Summary](/en/ch11#id292), [Glossary](/en/glossary)
- and functional programming, [MapReduce](/en/ch11#sec_batch_mapreduce)
- benefits of, [Batch Processing](/en/ch11#ch_batch)
- combining with stream processing, [Unifying batch and stream processing](/en/ch13#id338)
- comparison to stream processing, [Processing Streams](/en/ch12#sec_stream_processing)
- dataflow engines, [Dataflow Engines](/en/ch11#sec_batch_dataflow)
- fault tolerance, [Handling Faults](/en/ch11#id281), [Messaging Systems](/en/ch12#sec_stream_messaging)
- for data integration, [Batch and Stream Processing](/en/ch13#sec_future_batch_streaming)-[Unifying batch and stream processing](/en/ch13#id338)
- graphs and iterative processing, [Machine Learning](/en/ch11#id290)
- high-level APIs and languages, [Query languages](/en/ch11#sec_batch_query_lanauges)
- in cloud data warehouses, [Query languages](/en/ch11#sec_batch_query_lanauges)
- in distributed systems, [Batch Processing in Distributed Systems](/en/ch11#sec_batch_distributed)
- join and group by, [JOIN and GROUP BY](/en/ch11#sec_batch_join)
- limitations, [Batch Processing](/en/ch11#ch_batch)
- log-based messaging and, [Replaying old messages](/en/ch12#sec_stream_replay)
- maintaining derived state, [Maintaining derived state](/en/ch13#id446)
- measuring performance, [Batch Processing](/en/ch11#ch_batch)
- models of, [Batch Processing Models](/en/ch11#id431)
- resource allocation, [Resource Allocation](/en/ch11#id279)
- resource managers, [Distributed Job Orchestration](/en/ch11#id278)
- schedulers, [Distributed Job Orchestration](/en/ch11#id278)
- serving derived data, [Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- shuffling data, [Shuffling Data](/en/ch11#sec_shuffle)
- task execution, [Distributed Job Orchestration](/en/ch11#id278)
- use cases, [Batch Use Cases](/en/ch11#sec_batch_output)-[Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- using Unix tools (example), [Batch Processing with Unix Tools](/en/ch11#sec_batch_unix)-[Sorting Versus In-memory Aggregation](/en/ch11#id275)
- batch processing frameworks
- comparison to operating systems, [Batch Processing in Distributed Systems](/en/ch11#sec_batch_distributed)
- Beam (dataflow library), [Unifying batch and stream processing](/en/ch13#id338)
- BERT (language model), [Vector Embeddings](/en/ch4#id92)
- bias, [Bias and Discrimination](/en/ch14#id370)
- bidirectional replication (see multi-leader replication)
- big ball of mud, [Simplicity: Managing Complexity](/en/ch2#id38)
- big data
- versus data minimization, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance), [Legislation and Self-Regulation](/en/ch14#sec_future_legislation)
- BigQuery (database), [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses), [Batch Processing](/en/ch11#ch_batch)
- DataFrames, [Query languages](/en/ch11#sec_batch_query_lanauges)
- sharding and clustering, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- shuffling data, [Shuffling Data](/en/ch11#sec_shuffle)
- snapshot isolation support, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)
- Bigtable (database)
- sharding scheme, [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- storage layout, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- tablets (sharding), [Sharding](/en/ch7#ch_sharding)
- wide-column data model, [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality), [Column Compression](/en/ch4#sec_storage_column_compression)
- binary data encodings, [Binary encoding](/en/ch5#binary-encoding)-[The Merits of Schemas](/en/ch5#sec_encoding_schemas)
- Avro, [Avro](/en/ch5#sec_encoding_avro)-[Dynamically generated schemas](/en/ch5#dynamically-generated-schemas)
- MessagePack, [Binary encoding](/en/ch5#binary-encoding)
- Protocol Buffers, [Protocol Buffers](/en/ch5#sec_encoding_protobuf)-[Field tags and schema evolution](/en/ch5#field-tags-and-schema-evolution)
- binary encoding
- based on schemas, [The Merits of Schemas](/en/ch5#sec_encoding_schemas)
- by network drivers, [The Merits of Schemas](/en/ch5#sec_encoding_schemas)
- binary strings, lack of support in JSON and XML, [JSON, XML, and Binary Variants](/en/ch5#sec_encoding_json)
- Bitcoin (cryptocurrency), [Tools for auditable data systems](/en/ch13#id366)
- Byzantine fault tolerance, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)
- concurrency bugs in exchanges, [Weak Isolation Levels](/en/ch8#sec_transactions_isolation_levels)
- bitmap indexes, [Column Compression](/en/ch4#sec_storage_column_compression)
- BitTorrent uTP protocol, [The Limitations of TCP](/en/ch9#sec_distributed_tcp)
- Bkd-trees (indexes), [Multidimensional and Full-Text Indexes](/en/ch4#sec_storage_multidimensional)
- blameless postmortems, [Humans and Reliability](/en/ch2#id31)
- Blazegraph (database), [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- SPARQL query language, [The SPARQL query language](/en/ch3#the-sparql-query-language)
- blob storage (see object storage)
- block (file system), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- block device (disk), [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute)
- blockchains, [Summary](/en/ch3#summary)
- Byzantine fault tolerance, [Byzantine Faults](/en/ch9#sec_distributed_byzantine), [Consensus](/en/ch10#sec_consistency_consensus), [Tools for auditable data systems](/en/ch13#id366)
- blocking atomic commit, [Three-phase commit](/en/ch8#three-phase-commit)
- Bloom filter (algorithm), [Bloom filters](/en/ch4#bloom-filters), [Read performance](/en/ch4#read-performance), [Stream analytics](/en/ch12#id318)
- BookKeeper (replicated log), [Allocating work to nodes](/en/ch10#allocating-work-to-nodes)
- bounded datasets, [Stream Processing](/en/ch12#ch_stream), [Glossary](/en/glossary)
- (see also batch processing)
- bounded delays, [Glossary](/en/glossary)
- in networks, [Synchronous Versus Asynchronous Networks](/en/ch9#sec_distributed_sync_networks)
- process pauses, [Response time guarantees](/en/ch9#sec_distributed_clocks_realtime)
- broadcast
- total order broadcast (see shared logs)
- brokerless messaging, [Direct messaging from producers to consumers](/en/ch12#id296)
- Brubeck (metrics aggregator), [Direct messaging from producers to consumers](/en/ch12#id296)
- BTM (transaction coordinator), [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc)
- Buf
- Bufstream (messaging), [Setting Up New Followers](/en/ch6#sec_replication_new_replica), [Disk space usage](/en/ch12#sec_stream_disk_usage)
- build or buy, [Cloud Versus Self-Hosting](/en/ch1#sec_introduction_cloud)
- bursty network traffic patterns, [Can we not simply make network delays predictable?](/en/ch9#can-we-not-simply-make-network-delays-predictable)
- business analyst, [Operational Versus Analytical Systems](/en/ch1#sec_introduction_analytics), [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake)
- business data processing, [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp)
- business intelligence, [Operational Versus Analytical Systems](/en/ch1#sec_introduction_analytics)-[Data Warehousing](/en/ch1#sec_introduction_dwh)
- Business Process Execution Language (BPEL), [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows)
- Business Process Model and Notation (BPMN), [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows)
- example, [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows)
- byte sequence, encoding data in, [Formats for Encoding Data](/en/ch5#sec_encoding_formats)
- Byzantine faults, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)-[Weak forms of lying](/en/ch9#weak-forms-of-lying), [System Model and Reality](/en/ch9#sec_distributed_system_model), [Glossary](/en/glossary)
- Byzantine fault-tolerant systems, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)
- Byzantine Generals Problem, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)
- consensus algorithms and, [Consensus](/en/ch10#sec_consistency_consensus), [Tools for auditable data systems](/en/ch13#id366)
### C
- caches, [Keeping everything in memory](/en/ch4#sec_storage_inmemory), [Glossary](/en/glossary)
- and materialized views, [Materialized Views and Data Cubes](/en/ch4#sec_storage_materialized_views)
- as derived data, [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived), [Composing Data Storage Technologies](/en/ch13#id447)-[Unbundled versus integrated systems](/en/ch13#id448)
- in CPUs, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized), [Linearizability and network delays](/en/ch10#linearizability-and-network-delays)
- invalidation and maintenance, [Keeping Systems in Sync](/en/ch12#sec_stream_sync), [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- linearizability, [Linearizability](/en/ch10#sec_consistency_linearizability)
- local disks in the cloud, [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute)
- calendar sync, [Sync Engines and Local-First Software](/en/ch6#sec_replication_offline_clients), [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- California Consumer Privacy Act (CCPA), [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance)
- Camunda (workflow engine), [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows)
- canonical version (of data), [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived)
- CAP theorem, [The CAP theorem](/en/ch10#the-cap-theorem), [Glossary](/en/glossary)
- capacity planning, [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- Cap'n Proto (data format), [Formats for Encoding Data](/en/ch5#sec_encoding_formats)
- carbon emissions, [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed)
- cascading aborts, [No dirty reads](/en/ch8#no-dirty-reads)
- cascading failures, [Software faults](/en/ch2#software-faults), [Operations: Automatic or Manual Rebalancing](/en/ch7#sec_sharding_operations), [Timeouts and Unbounded Delays](/en/ch9#sec_distributed_queueing)
- Cassandra (database)
- change data capture, [Implementing change data capture](/en/ch12#id307), [API support for change streams](/en/ch12#sec_stream_change_api)
- compaction strategy, [Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- consistency level ANY, [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- hash-range sharding, [Sharding by Hash of Key](/en/ch7#sec_sharding_hash), [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- last-write-wins conflict resolution, [Detecting Concurrent Writes](/en/ch6#sec_replication_concurrent)
- leaderless replication, [Leaderless Replication](/en/ch6#sec_replication_leaderless)
- lightweight transactions, [Single-object writes](/en/ch8#sec_transactions_single_object)
- linearizability, lack of, [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)
- log-structured storage, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- multi-region support, [Multi-region operation](/en/ch6#multi-region-operation)
- secondary indexes, [Local Secondary Indexes](/en/ch7#id166)
- use of clocks, [Limitations of Quorum Consistency](/en/ch6#sec_replication_quorum_limitations), [Timestamps for ordering events](/en/ch9#sec_distributed_lww)
- vnodes (sharding), [Sharding](/en/ch7#ch_sharding)
- cat (Unix tool), [Simple Log Analysis](/en/ch11#sec_batch_log_analysis)
- catalog, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- causal context, [Version vectors](/en/ch6#version-vectors)
- (see also causal dependencies)
- causal dependencies, [The "happens-before" relation and concurrency](/en/ch6#sec_replication_happens_before)-[Version vectors](/en/ch6#version-vectors)
- capturing, [Version vectors](/en/ch6#version-vectors), [Ordering events to capture causality](/en/ch13#sec_future_capture_causality), [Reads are events too](/en/ch13#sec_future_read_events)
- by total ordering, [The limits of total ordering](/en/ch13#id335)
- in transactions, [Decisions based on an outdated premise](/en/ch8#decisions-based-on-an-outdated-premise)
- sending message to friends (example), [Ordering events to capture causality](/en/ch13#sec_future_capture_causality)
- causality, [Glossary](/en/glossary)
- causal ordering
- total order consistent with, [Logical Clocks](/en/ch10#sec_consistency_timestamps)
- consistency with, [Logical Clocks](/en/ch10#sec_consistency_timestamps)-[Enforcing constraints using logical clocks](/en/ch10#enforcing-constraints-using-logical-clocks)
- happens-before relation, [The "happens-before" relation and concurrency](/en/ch6#sec_replication_happens_before)
- in serializable transactions, [Decisions based on an outdated premise](/en/ch8#decisions-based-on-an-outdated-premise)-[Detecting writes that affect prior reads](/en/ch8#sec_detecting_writes_affect_reads)
- mismatch with clocks, [Timestamps for ordering events](/en/ch9#sec_distributed_lww)
- ordering events to capture, [Ordering events to capture causality](/en/ch13#sec_future_capture_causality)
- violations of, [Consistent Prefix Reads](/en/ch6#sec_replication_consistent_prefix), [Problems with different topologies](/en/ch6#problems-with-different-topologies), [Timestamps for ordering events](/en/ch9#sec_distributed_lww)
- with synchronized clocks, [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- cell-based architecture, [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- CEP (see complex event processing)
- CephFS (distributed filesystem), [Batch Processing](/en/ch11#ch_batch), [Object Stores](/en/ch11#id277)
- certificate transparency, [Tools for auditable data systems](/en/ch13#id366)
- cgroups, [Distributed Job Orchestration](/en/ch11#id278)
- change data capture, [Logical (row-based) log replication](/en/ch6#logical-row-based-log-replication), [Change Data Capture](/en/ch12#sec_stream_cdc)
- API support for change streams, [API support for change streams](/en/ch12#sec_stream_change_api)
- comparison to event sourcing, [Change data capture versus event sourcing](/en/ch12#sec_stream_event_sourcing)
- implementing, [Implementing change data capture](/en/ch12#id307)
- initial snapshot, [Initial snapshot](/en/ch12#sec_stream_cdc_snapshot)
- log compaction, [Log compaction](/en/ch12#sec_stream_log_compaction)
- changelogs, [State, Streams, and Immutability](/en/ch12#sec_stream_immutability)
- change data capture, [Change Data Capture](/en/ch12#sec_stream_cdc)
- for operator state, [Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- in stream joins, [Stream-table join (stream enrichment)](/en/ch12#sec_stream_table_joins)
- log compaction, [Log compaction](/en/ch12#sec_stream_log_compaction)
- maintaining derived state, [Databases and Streams](/en/ch12#sec_stream_databases)
- chaos engineering, [Fault Tolerance](/en/ch2#id27), [Fault injection](/en/ch9#sec_fault_injection)
- checkpointing
- in high-performance computing, [Cloud Computing Versus Supercomputing](/en/ch1#id17)
- in stream processors, [Microbatching and checkpointing](/en/ch12#id329)
- circuit breaker (limiting retries), [Describing Performance](/en/ch2#sec_introduction_percentiles)
- circuit-switched networks, [Synchronous Versus Asynchronous Networks](/en/ch9#sec_distributed_sync_networks)
- circular buffers, [Disk space usage](/en/ch12#sec_stream_disk_usage)
- circular replication topologies, [Multi-leader replication topologies](/en/ch6#sec_replication_topologies)
- Citus (database)
- hash sharding, [Fixed number of shards](/en/ch7#fixed-number-of-shards)
- ClickHouse (database), [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp), [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native)
- incremental view maintenance, [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- clickstream data, analysis of, [JOIN and GROUP BY](/en/ch11#sec_batch_join)
- clients
- calling services, [Dataflow Through Services: REST and RPC](/en/ch5#sec_encoding_dataflow_rpc)
- offline-capable, [Sync Engines and Local-First Software](/en/ch6#sec_replication_offline_clients), [Stateful, offline-capable clients](/en/ch13#id347)
- pushing state changes to, [Pushing state changes to clients](/en/ch13#id348)
- request routing, [Request Routing](/en/ch7#sec_sharding_routing)
- ClockBound (time sync), [Clock readings with a confidence interval](/en/ch9#clock-readings-with-a-confidence-interval)
- use in YugabyteDB, [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- clocks, [Unreliable Clocks](/en/ch9#sec_distributed_clocks)-[Limiting the impact of garbage collection](/en/ch9#sec_distributed_gc_impact)
- atomic clocks, [Clock readings with a confidence interval](/en/ch9#clock-readings-with-a-confidence-interval), [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- confidence interval, [Clock readings with a confidence interval](/en/ch9#clock-readings-with-a-confidence-interval)-[Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- for global snapshots, [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- hybrid logical clocks, [Hybrid logical clocks](/en/ch10#hybrid-logical-clocks)
- logical (see logical clocks)
- skew, [Last write wins (discarding concurrent writes)](/en/ch6#sec_replication_lww), [Limitations of Quorum Consistency](/en/ch6#sec_replication_quorum_limitations), [Relying on Synchronized Clocks](/en/ch9#sec_distributed_clocks_relying)-[Clock readings with a confidence interval](/en/ch9#clock-readings-with-a-confidence-interval), [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)
- slewing, [Monotonic clocks](/en/ch9#monotonic-clocks)
- synchronization and accuracy, [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy)
- synchronization using GPS, [Unreliable Clocks](/en/ch9#sec_distributed_clocks), [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy), [Clock readings with a confidence interval](/en/ch9#clock-readings-with-a-confidence-interval), [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- time-of-day versus monotonic clocks, [Monotonic Versus Time-of-Day Clocks](/en/ch9#sec_distributed_monotonic_timeofday)
- timestamping events, [Whose clock are you using, anyway?](/en/ch12#id438)
- cloud services, [Cloud Versus Self-Hosting](/en/ch1#sec_introduction_cloud)-[Cloud Computing Versus Supercomputing](/en/ch1#id17)
- availability zones, [Tolerating hardware faults through redundancy](/en/ch2#tolerating-hardware-faults-through-redundancy), [Reading Your Own Writes](/en/ch6#sec_replication_ryw)
- data warehouses, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- need for service discovery, [Service discovery](/en/ch10#service-discovery)
- network glitches, [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)
- pros and cons, [Pros and Cons of Cloud Services](/en/ch1#sec_introduction_cloud_tradeoffs)
- quotas, [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- regions (see regions (geographic distribution))
- serverless, [Microservices and Serverless](/en/ch1#sec_introduction_microservices)
- shared resources, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- versus supercomputing, [Cloud Computing Versus Supercomputing](/en/ch1#id17)
- cloud-native, [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native)-[Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- Cloudflare
- R2 (see R2 (object storage))
- clustered indexes, [Storing values within the index](/en/ch4#sec_storage_index_heap)
- clustering (record ordering), [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- CockroachDB (database)
- consensus-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- consistency model, [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- key-range sharding, [Sharding](/en/ch7#ch_sharding), [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- serializable transactions, [Serializable Snapshot Isolation (SSI)](/en/ch8#sec_transactions_ssi)
- sharded secondary indexes, [Global Secondary Indexes](/en/ch7#id167)
- transactions, [What Exactly Is a Transaction?](/en/ch8#sec_transactions_overview), [Database-internal Distributed Transactions](/en/ch8#sec_transactions_internal)
- use of model-checking, [Model checking and specification languages](/en/ch9#model-checking-and-specification-languages)
- code generation
- for query execution, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- with Protocol Buffers, [Protocol Buffers](/en/ch5#sec_encoding_protobuf)
- collaborative editing, [Real-time collaboration, offline-first, and local-first apps](/en/ch6#real-time-collaboration-offline-first-and-local-first-apps)
- column families (Bigtable), [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality), [Column Compression](/en/ch4#sec_storage_column_compression)
- column-oriented storage, [Column-Oriented Storage](/en/ch4#sec_storage_column)-[Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- column compression, [Column Compression](/en/ch4#sec_storage_column_compression)
- Parquet, [Column-Oriented Storage](/en/ch4#sec_storage_column), [Archival storage](/en/ch5#archival-storage)
- sort order in, [Sort Order in Column Storage](/en/ch4#sort-order-in-column-storage)
- vectorized processing, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- versus wide-column model, [Column Compression](/en/ch4#sec_storage_column_compression)
- writing to, [Writing to Column-Oriented Storage](/en/ch4#writing-to-column-oriented-storage)
- comma-separated values (see CSV)
- command query responsibility segregation (CQRS), [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events), [Deriving several views from the same event log](/en/ch12#sec_stream_deriving_views)
- commands (event sourcing), [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- commits (transactions), [Transactions](/en/ch8#ch_transactions)
- atomic commit, [Distributed Transactions](/en/ch8#sec_transactions_distributed)-[Exactly-once message processing revisited](/en/ch8#exactly-once-message-processing-revisited)
- (see also atomicity; transactions)
- read committed isolation, [Read Committed](/en/ch8#sec_transactions_read_committed)
- three-phase commit (3PC), [Three-phase commit](/en/ch8#three-phase-commit)
- two-phase commit (2PC), [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc)-[Coordinator failure](/en/ch8#coordinator-failure)
- commutative operations, [Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- compaction
- of changelogs, [Log compaction](/en/ch12#sec_stream_log_compaction)
- (see also log compaction)
- for stream operator state, [Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- of log-structured storage, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- issues with, [Read performance](/en/ch4#read-performance)
- size-tiered and leveled approaches, [Compaction strategies](/en/ch4#sec_storage_lsm_compaction), [Disk space usage](/en/ch4#disk-space-usage)
- compare-and-set (CAS), [Conditional writes (compare-and-set)](/en/ch8#sec_transactions_compare_and_set), [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- implementing locks, [Coordination Services](/en/ch10#sec_consistency_coordination)
- implementing uniqueness constraints, [Constraints and uniqueness guarantees](/en/ch10#sec_consistency_uniqueness)
- on object storage, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- relation to consensus, [Linearizability and quorums](/en/ch10#sec_consistency_quorum_linearizable), [Consensus](/en/ch10#sec_consistency_consensus), [Compare-and-set as consensus](/en/ch10#compare-and-set-as-consensus)
- relation to fencing tokens, [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens)
- relation to transactions, [Single-object writes](/en/ch8#sec_transactions_single_object)
- compatibility, [Encoding and Evolution](/en/ch5#ch_encoding), [Modes of Dataflow](/en/ch5#sec_encoding_dataflow)
- calling services, [Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- properties of encoding formats, [Summary](/en/ch5#summary)
- using databases, [Dataflow Through Databases](/en/ch5#sec_encoding_dataflow_db)-[Archival storage](/en/ch5#archival-storage)
- compensating transactions, [Advantages of immutable events](/en/ch12#sec_stream_immutability_pros), [Loosely interpreted constraints](/en/ch13#id362)
- compilation, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- complex event processing (CEP), [Complex event processing](/en/ch12#id317)
- complexity
- distilling in theoretical models, [Mapping system models to the real world](/en/ch9#mapping-system-models-to-the-real-world)
- essential and accidental, [Simplicity: Managing Complexity](/en/ch2#id38)
- hiding using abstraction, [Data Models and Query Languages](/en/ch3#ch_datamodels)
- managing, [Simplicity: Managing Complexity](/en/ch2#id38)
- composing data systems (see unbundling databases)
- compression
- in SSTables, [The SSTable file format](/en/ch4#the-sstable-file-format)
- compute-intensive applications, [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs)
- computer games, [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- concatenated indexes, [Multidimensional and Full-Text Indexes](/en/ch4#sec_storage_multidimensional)
- in hash-sharded systems, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- concurrency
- actor programming model, [Distributed actor frameworks](/en/ch5#distributed-actor-frameworks), [Event-Driven Architectures and RPC](/en/ch12#sec_stream_actors_drpc)
- (see also event-driven architecture)
- bugs from weak transaction isolation, [Weak Isolation Levels](/en/ch8#sec_transactions_isolation_levels)
- conflict resolution, [Dealing with Conflicting Writes](/en/ch6#sec_replication_write_conflicts)-[Types of conflict](/en/ch6#sec_replication_write_conflicts)
- definition, [Dealing with Conflicting Writes](/en/ch6#sec_replication_write_conflicts)
- detecting concurrent writes, [Detecting Concurrent Writes](/en/ch6#sec_replication_concurrent)-[Version vectors](/en/ch6#version-vectors)
- dual writes, problems with, [Keeping Systems in Sync](/en/ch12#sec_stream_sync)
- happens-before relation, [The "happens-before" relation and concurrency](/en/ch6#sec_replication_happens_before)
- in replicated systems, [Problems with Replication Lag](/en/ch6#sec_replication_lag)-[Version vectors](/en/ch6#version-vectors), [Linearizability](/en/ch10#sec_consistency_linearizability)-[Linearizability and network delays](/en/ch10#linearizability-and-network-delays)
- lost updates, [Preventing Lost Updates](/en/ch8#sec_transactions_lost_update)
- multi-version concurrency control (MVCC), [Multi-version concurrency control (MVCC)](/en/ch8#sec_transactions_snapshot_impl), [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- optimistic concurrency control, [Pessimistic versus optimistic concurrency control](/en/ch8#pessimistic-versus-optimistic-concurrency-control)
- ordering of operations, [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- reducing, through event logs, [Concurrency control](/en/ch12#sec_stream_concurrency), [Dataflow: Interplay between state changes and application code](/en/ch13#id450)
- time and relativity, [The "happens-before" relation and concurrency](/en/ch6#sec_replication_happens_before)
- transaction isolation, [Isolation](/en/ch8#sec_transactions_acid_isolation)
- write skew (transaction isolation), [Write Skew and Phantoms](/en/ch8#sec_transactions_write_skew)-[Materializing conflicts](/en/ch8#materializing-conflicts)
- conditional write, [Conditional writes (compare-and-set)](/en/ch8#sec_transactions_compare_and_set)
- in transactions, [Single-object writes](/en/ch8#sec_transactions_single_object)
- on object storage, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- conference management system (example), [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- conflict-free replicated datatypes (CRDTs), [CRDTs and Operational Transformation](/en/ch6#sec_replication_crdts)
- for leaderless replication, [Capturing the happens-before relationship](/en/ch6#capturing-the-happens-before-relationship)
- preventing lost updates, [Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- conflicts
- avoidance, [Conflict avoidance](/en/ch6#conflict-avoidance)
- causal dependencies, [The "happens-before" relation and concurrency](/en/ch6#sec_replication_happens_before)
- conflict detection
- in distributed transactions, [Problems with XA transactions](/en/ch8#problems-with-xa-transactions)
- in log-based systems, [Uniqueness constraints require consensus](/en/ch13#id452)
- in serializable snapshot isolation (SSI), [Detecting writes that affect prior reads](/en/ch8#sec_detecting_writes_affect_reads)
- in two-phase commit, [A system of promises](/en/ch8#a-system-of-promises)
- conflict resolution
- by aborting transactions, [Pessimistic versus optimistic concurrency control](/en/ch8#pessimistic-versus-optimistic-concurrency-control)
- by apologizing, [Loosely interpreted constraints](/en/ch13#id362)
- last write wins (LWW), [Timestamps for ordering events](/en/ch9#sec_distributed_lww)
- using atomic operations, [Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- determining what is a conflict, [Types of conflict](/en/ch6#sec_replication_write_conflicts), [Uniqueness in log-based messaging](/en/ch13#sec_future_uniqueness_log)
- in leaderless replication, [Detecting Concurrent Writes](/en/ch6#sec_replication_concurrent)
- lost updates, [Preventing Lost Updates](/en/ch8#sec_transactions_lost_update)-[Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- materializing, [Materializing conflicts](/en/ch8#materializing-conflicts)
- resolution, [Dealing with Conflicting Writes](/en/ch6#sec_replication_write_conflicts)-[Types of conflict](/en/ch6#sec_replication_write_conflicts)
- automatic, [Automatic conflict resolution](/en/ch6#automatic-conflict-resolution)
- in leaderless systems, [Detecting Concurrent Writes](/en/ch6#sec_replication_concurrent)
- last write wins (LWW), [Last write wins (discarding concurrent writes)](/en/ch6#sec_replication_lww)
- using custom logic, [Manual conflict resolution](/en/ch6#manual-conflict-resolution), [Capturing the happens-before relationship](/en/ch6#capturing-the-happens-before-relationship)
- siblings, [Manual conflict resolution](/en/ch6#manual-conflict-resolution), [Capturing the happens-before relationship](/en/ch6#capturing-the-happens-before-relationship)
- merging, [Capturing the happens-before relationship](/en/ch6#capturing-the-happens-before-relationship)
- write skew (transaction isolation), [Write Skew and Phantoms](/en/ch8#sec_transactions_write_skew)-[Materializing conflicts](/en/ch8#materializing-conflicts)
- Confluent
- Freight (messaging), [Setting Up New Followers](/en/ch6#sec_replication_new_replica), [Disk space usage](/en/ch12#sec_stream_disk_usage)
- schema registry, [JSON Schema](/en/ch5#json-schema), [But what is the writer's schema?](/en/ch5#but-what-is-the-writers-schema)
- congestion (networks)
- avoidance, [The Limitations of TCP](/en/ch9#sec_distributed_tcp)
- limiting accuracy of clocks, [Clock readings with a confidence interval](/en/ch9#clock-readings-with-a-confidence-interval)
- queueing delays, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- consensus, [Consensus](/en/ch10#sec_consistency_consensus)-[Summary](/en/ch10#summary), [Glossary](/en/glossary)
- algorithms, [Consensus](/en/ch10#sec_consistency_consensus), [Consensus in Practice](/en/ch10#sec_consistency_total_order)
- consensus numbers, [Fetch-and-add as consensus](/en/ch10#fetch-and-add-as-consensus)
- coordination services, [Coordination Services](/en/ch10#sec_consistency_coordination)-[Service discovery](/en/ch10#service-discovery)
- cost of, [Pros and cons of consensus](/en/ch10#pros-and-cons-of-consensus)
- impossibility of, [Consensus](/en/ch10#sec_consistency_consensus)
- preventing split brain, [From single-leader replication to consensus](/en/ch10#from-single-leader-replication-to-consensus)
- reconfiguration, [Subtleties of consensus](/en/ch10#subtleties-of-consensus)
- relation to atomic commitment, [Atomic commitment as consensus](/en/ch10#atomic-commitment-as-consensus)
- relation to compare-and-set (CAS), [Linearizability and quorums](/en/ch10#sec_consistency_quorum_linearizable), [Compare-and-set as consensus](/en/ch10#compare-and-set-as-consensus)
- relation to fetch-and-add, [Fetch-and-add as consensus](/en/ch10#fetch-and-add-as-consensus)
- relation to replication, [Using shared logs](/en/ch10#sec_consistency_smr)
- relation to shared logs, [Shared logs as consensus](/en/ch10#sec_consistency_shared_logs)
- relation to uniqueness constraints, [Uniqueness constraints require consensus](/en/ch13#id452)
- safety and liveness properties, [Single-value consensus](/en/ch10#single-value-consensus)
- single-value consensus, [Single-value consensus](/en/ch10#single-value-consensus)
- consent (GDPR), [Consent and Freedom of Choice](/en/ch14#id375)
- consistency, [Consistency](/en/ch8#sec_transactions_acid_consistency), [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- across different databases, [Leader failure: Failover](/en/ch6#leader-failure-failover), [Keeping Systems in Sync](/en/ch12#sec_stream_sync), [Deriving several views from the same event log](/en/ch12#sec_stream_deriving_views), [Derived data versus distributed transactions](/en/ch13#sec_future_derived_vs_transactions)
- causal, [Consistent Prefix Reads](/en/ch6#sec_replication_consistent_prefix), [Problems with different topologies](/en/ch6#problems-with-different-topologies), [Ordering events to capture causality](/en/ch13#sec_future_capture_causality)
- consistent prefix reads, [Consistent Prefix Reads](/en/ch6#sec_replication_consistent_prefix)
- consistent snapshots, [Setting Up New Followers](/en/ch6#sec_replication_new_replica), [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)-[Snapshot isolation, repeatable read, and naming confusion](/en/ch8#snapshot-isolation-repeatable-read-and-naming-confusion), [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner), [Initial snapshot](/en/ch12#sec_stream_cdc_snapshot), [Creating an index](/en/ch13#id340)
- (see also snapshots)
- crash recovery, [Making B-trees reliable](/en/ch4#sec_storage_btree_wal)
- enforcing constraints (see constraints)
- eventual, [Problems with Replication Lag](/en/ch6#sec_replication_lag)
- (see also eventual consistency)
- in ACID transactions, [Consistency](/en/ch8#sec_transactions_acid_consistency), [Maintaining integrity in the face of software bugs](/en/ch13#id455)
- in CAP theorem, [The CAP theorem](/en/ch10#the-cap-theorem)
- in leader election, [Subtleties of consensus](/en/ch10#subtleties-of-consensus)
- in microservices, [Problems with Distributed Systems](/en/ch1#sec_introduction_dist_sys_problems)
- linearizability, [Solutions for Replication Lag](/en/ch6#id131), [Linearizability](/en/ch10#sec_consistency_linearizability)-[Linearizability and network delays](/en/ch10#linearizability-and-network-delays)
- meanings of, [Consistency](/en/ch8#sec_transactions_acid_consistency)
- monotonic reads, [Monotonic Reads](/en/ch6#sec_replication_monotonic_reads)
- of secondary indexes, [The need for multi-object transactions](/en/ch8#sec_transactions_need), [Indexes and snapshot isolation](/en/ch8#indexes-and-snapshot-isolation), [Reasoning about dataflows](/en/ch13#id443), [Creating an index](/en/ch13#id340)
- read-after-write, [Reading Your Own Writes](/en/ch6#sec_replication_ryw)
- in derived data systems, [Derived data versus distributed transactions](/en/ch13#sec_future_derived_vs_transactions)
- strong (see linearizability)
- timeliness and integrity, [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- using quorums, [Limitations of Quorum Consistency](/en/ch6#sec_replication_quorum_limitations), [Linearizability and quorums](/en/ch10#sec_consistency_quorum_linearizable)
- consistent hashing, [Consistent hashing](/en/ch7#sec_sharding_consistent_hashing)
- consistent prefix reads, [Consistent Prefix Reads](/en/ch6#sec_replication_consistent_prefix)
- constraints (databases), [Consistency](/en/ch8#sec_transactions_acid_consistency), [Characterizing write skew](/en/ch8#characterizing-write-skew)
- asynchronously checked, [Loosely interpreted constraints](/en/ch13#id362)
- coordination avoidance, [Coordination-avoiding data systems](/en/ch13#id454)
- ensuring idempotence, [Uniquely identifying requests](/en/ch13#id355)
- in log-based systems, [Enforcing Constraints](/en/ch13#sec_future_constraints)-[Multi-shard request processing](/en/ch13#id360)
- across multiple shards, [Multi-shard request processing](/en/ch13#id360)
- in two-phase commit, [Distributed Transactions](/en/ch8#sec_transactions_distributed), [A system of promises](/en/ch8#a-system-of-promises)
- relation to consensus, [Uniqueness constraints require consensus](/en/ch13#id452)
- requiring linearizability, [Constraints and uniqueness guarantees](/en/ch10#sec_consistency_uniqueness)
- Consul (coordination service), [Coordination Services](/en/ch10#sec_consistency_coordination)
- use for service discovery, [Service discovery](/en/ch10#service-discovery)
- consumers (message streams), [Message brokers](/en/ch5#message-brokers), [Transmitting Event Streams](/en/ch12#sec_stream_transmit)
- backpressure, [Messaging Systems](/en/ch12#sec_stream_messaging)
- consumer groups, [Multiple consumers](/en/ch12#id298)
- consumer offsets in logs, [Consumer offsets](/en/ch12#sec_stream_log_offsets)
- failures, [Acknowledgments and redelivery](/en/ch12#sec_stream_reordering), [Consumer offsets](/en/ch12#sec_stream_log_offsets)
- fan-out, [Materializing and Updating Timelines](/en/ch2#sec_introduction_materializing), [Multiple consumers](/en/ch12#id298), [Logs compared to traditional messaging](/en/ch12#sec_stream_logs_vs_messaging)
- load balancing, [Multiple consumers](/en/ch12#id298), [Logs compared to traditional messaging](/en/ch12#sec_stream_logs_vs_messaging)
- not keeping up with producers, [Messaging Systems](/en/ch12#sec_stream_messaging), [Disk space usage](/en/ch12#sec_stream_disk_usage), [Making unbundling work](/en/ch13#sec_future_unbundling_favor)
- content models (JSON Schema), [JSON Schema](/en/ch5#json-schema)
- contention
- between transactions, [Handling errors and aborts](/en/ch8#handling-errors-and-aborts)
- blocking threads, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- performance of optimistic concurrency control, [Pessimistic versus optimistic concurrency control](/en/ch8#pessimistic-versus-optimistic-concurrency-control)
- under two-phase locking, [Performance of two-phase locking](/en/ch8#performance-of-two-phase-locking)
- context switches, [Latency and Response Time](/en/ch2#id23), [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- convergence (conflict resolution), [Automatic conflict resolution](/en/ch6#automatic-conflict-resolution)-[CRDTs and Operational Transformation](/en/ch6#sec_replication_crdts)
- coordination
- avoidance, [Coordination-avoiding data systems](/en/ch13#id454)
- cross-datacenter, [The limits of total ordering](/en/ch13#id335)
- cross-region, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- cross-shard ordering, [Sharding](/en/ch8#sharding), [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner), [Using shared logs](/en/ch10#sec_consistency_smr), [Multi-shard request processing](/en/ch13#id360)
- routing requests to shards, [Request Routing](/en/ch7#sec_sharding_routing)
- services, [Locking and leader election](/en/ch10#locking-and-leader-election), [Coordination Services](/en/ch10#sec_consistency_coordination)-[Service discovery](/en/ch10#service-discovery)
- coordinator (in 2PC), [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc)
- failure, [Coordinator failure](/en/ch8#coordinator-failure)
- in XA transactions, [XA transactions](/en/ch8#xa-transactions)-[Problems with XA transactions](/en/ch8#problems-with-xa-transactions)
- recovery, [Recovering from coordinator failure](/en/ch8#recovering-from-coordinator-failure)
- copy-on-write (B-trees), [B-tree variants](/en/ch4#b-tree-variants), [Indexes and snapshot isolation](/en/ch8#indexes-and-snapshot-isolation)
- CORBA (Common Object Request Broker Architecture), [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc)
- coronal mass ejection (see solar storm)
- correctness
- auditability, [Trust, but Verify](/en/ch13#sec_future_verification)-[Tools for auditable data systems](/en/ch13#id366)
- Byzantine fault tolerance, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)
- dealing with partial failures, [Faults and Partial Failures](/en/ch9#sec_distributed_partial_failure)
- in log-based systems, [Enforcing Constraints](/en/ch13#sec_future_constraints)-[Multi-shard request processing](/en/ch13#id360)
- of algorithm within system model, [Defining the correctness of an algorithm](/en/ch9#defining-the-correctness-of-an-algorithm)
- of derived data, [Designing for auditability](/en/ch13#id365)
- of immutable data, [Advantages of immutable events](/en/ch12#sec_stream_immutability_pros)
- of personal data, [Responsibility and Accountability](/en/ch14#id371), [Privacy and Use of Data](/en/ch14#id457)
- of time, [Problems with different topologies](/en/ch6#problems-with-different-topologies), [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy)-[Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- of transactions, [Consistency](/en/ch8#sec_transactions_acid_consistency), [Aiming for Correctness](/en/ch13#sec_future_correctness), [Maintaining integrity in the face of software bugs](/en/ch13#id455)
- timeliness and integrity, [Timeliness and Integrity](/en/ch13#sec_future_integrity)-[Coordination-avoiding data systems](/en/ch13#id454)
- corruption of data
- detecting, [The end-to-end argument](/en/ch13#sec_future_e2e_argument), [Don't just blindly trust what they promise](/en/ch13#id364)-[Tools for auditable data systems](/en/ch13#id366)
- due to pathological memory access, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults)
- due to radiation, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)
- due to split brain, [Leader failure: Failover](/en/ch6#leader-failure-failover), [Distributed Locks and Leases](/en/ch9#sec_distributed_lock_fencing)
- due to weak transaction isolation, [Weak Isolation Levels](/en/ch8#sec_transactions_isolation_levels)
- integrity as absence of, [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- network packets, [Weak forms of lying](/en/ch9#weak-forms-of-lying)
- on disks, [Durability](/en/ch8#durability)
- preventing using write-ahead logs, [Making B-trees reliable](/en/ch4#sec_storage_btree_wal)
- recovering from, [Batch Processing](/en/ch11#ch_batch), [Advantages of immutable events](/en/ch12#sec_stream_immutability_pros)
- cosine similarity (semantic search), [Vector Embeddings](/en/ch4#id92)
- Couchbase (database)
- document data model, [Relational Model versus Document Model](/en/ch3#sec_datamodels_history)
- durability, [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- hash sharding, [Fixed number of shards](/en/ch7#fixed-number-of-shards)
- join support, [Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- rebalancing, [Operations: Automatic or Manual Rebalancing](/en/ch7#sec_sharding_operations)
- vBuckets (sharding), [Sharding](/en/ch7#ch_sharding)
- CouchDB (database)
- as sync engine, [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- B-tree storage, [Indexes and snapshot isolation](/en/ch8#indexes-and-snapshot-isolation)
- conflict resolution, [Manual conflict resolution](/en/ch6#manual-conflict-resolution)
- coupling (loose and tight), [Evolvability: Making Change Easy](/en/ch2#sec_introduction_evolvability)
- covering indexes, [Storing values within the index](/en/ch4#sec_storage_index_heap)
- CozoDB (database), [Datalog: Recursive Relational Queries](/en/ch3#id62)
- CPUs
- cache coherence and memory barriers, [Linearizability and network delays](/en/ch10#linearizability-and-network-delays)
- caching and pipelining, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- computing the wrong result, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults)
- SIMD instructions, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- crash-stop and crash-recovery faults, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- CRDTs (see conflict-free replicated datatypes)
- CREATE INDEX statement (SQL), [Multi-Column and Secondary Indexes](/en/ch4#sec_storage_index_multicolumn), [Creating an index](/en/ch13#id340)
- credit rating agencies, [Responsibility and Accountability](/en/ch14#id371)
- crypto-shredding, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events), [Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- cryptocurrencies, [Summary](/en/ch3#summary)
- cryptography
- defense against attackers, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)
- end-to-end encryption and authentication, [The end-to-end argument](/en/ch13#sec_future_e2e_argument)
- CSV (comma-separated values), [Storage and Indexing for OLTP](/en/ch4#sec_storage_oltp), [JSON, XML, and Binary Variants](/en/ch5#sec_encoding_json)
- Curator (ZooKeeper recipes), [Locking and leader election](/en/ch10#locking-and-leader-election), [Allocating work to nodes](/en/ch10#allocating-work-to-nodes)
- Cypher (query language), [The Cypher Query Language](/en/ch3#id57)
- comparison to SPARQL, [The SPARQL query language](/en/ch3#the-sparql-query-language)
### D
- Daft (processing framework)
- DataFrames, [DataFrames](/en/ch11#id287)
- shuffling data, [Shuffling Data](/en/ch11#sec_shuffle)
- Dagster (workflow scheduler), [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows), [Batch Processing](/en/ch11#ch_batch), [Scheduling Workflows](/en/ch11#sec_batch_workflows)
- cloud data warehouse integration, [Query languages](/en/ch11#sec_batch_query_lanauges)
- dashboard (business intelligence), [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp)
- Dask (processing framework), [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- data catalog, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- data connectors, [Data Warehousing](/en/ch1#sec_introduction_dwh)
- data contracts, [Extract--Transform--Load (ETL)](/en/ch11#sec_batch_etl_usage)
- change data capture, [Change data capture versus event sourcing](/en/ch12#sec_stream_event_sourcing)
- data corruption (see corruption of data)
- data cubes, [Materialized Views and Data Cubes](/en/ch4#sec_storage_materialized_views)
- data engineering, [Operational Versus Analytical Systems](/en/ch1#sec_introduction_analytics)
- data fabric, [Extract--Transform--Load (ETL)](/en/ch11#sec_batch_etl_usage)
- data formats (see encoding)
- data infrastructure, [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs)
- data integration, [Data Integration](/en/ch13#sec_future_integration)-[Unifying batch and stream processing](/en/ch13#id338), [Summary](/en/ch13#id367)
- batch and stream processing, [Batch and Stream Processing](/en/ch13#sec_future_batch_streaming)-[Unifying batch and stream processing](/en/ch13#id338)
- maintaining derived state, [Maintaining derived state](/en/ch13#id446)
- reprocessing data, [Reprocessing data for application evolution](/en/ch13#sec_future_reprocessing)
- unifying, [Unifying batch and stream processing](/en/ch13#id338)
- by unbundling databases, [Unbundling Databases](/en/ch13#sec_future_unbundling)-[Multi-shard data processing](/en/ch13#sec_future_unbundled_multi_shard)
- comparison to federated databases, [The meta-database of everything](/en/ch13#id341)
- combining tools by deriving data, [Combining Specialized Tools by Deriving Data](/en/ch13#id442)-[Ordering events to capture causality](/en/ch13#sec_future_capture_causality)
- derived data versus distributed transactions, [Derived data versus distributed transactions](/en/ch13#sec_future_derived_vs_transactions)
- limits of total ordering, [The limits of total ordering](/en/ch13#id335)
- ordering events to capture causality, [Ordering events to capture causality](/en/ch13#sec_future_capture_causality)
- reasoning about dataflows, [Reasoning about dataflows](/en/ch13#id443)
- need for, [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived)
- using batch processing, [Batch Processing](/en/ch11#ch_batch), [Extract--Transform--Load (ETL)](/en/ch11#sec_batch_etl_usage)
- data lake, [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake)
- data lakehouse, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses), [Analytics](/en/ch11#sec_batch_olap)
- data locality (see locality)
- data mesh, [Extract--Transform--Load (ETL)](/en/ch11#sec_batch_etl_usage)
- data minimization, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance), [Legislation and Self-Regulation](/en/ch14#sec_future_legislation)
- data models, [Data Models and Query Languages](/en/ch3#ch_datamodels)-[Summary](/en/ch3#summary)
- DataFrames and arrays, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- graph-like models, [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)-[GraphQL](/en/ch3#id63)
- Datalog language, [Datalog: Recursive Relational Queries](/en/ch3#id62)
- property graphs, [Property Graphs](/en/ch3#id56)
- RDF and triple-stores, [Triple-Stores and SPARQL](/en/ch3#id59)-[The SPARQL query language](/en/ch3#the-sparql-query-language)
- relational model versus document model, [Relational Model versus Document Model](/en/ch3#sec_datamodels_history)-[Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- supporting multiple, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- data pipelines, [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake), [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived), [Extract--Transform--Load (ETL)](/en/ch11#sec_batch_etl_usage)
- data products, [Beyond the data lake](/en/ch1#beyond-the-data-lake)
- data protection regulations (see GDPR)
- data residence laws, [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed), [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- data science, [Operational Versus Analytical Systems](/en/ch1#sec_introduction_analytics), [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake)
- data silo, [Data Warehousing](/en/ch1#sec_introduction_dwh)
- data systems
- correctness, constraints, and integrity, [Aiming for Correctness](/en/ch13#sec_future_correctness)-[Tools for auditable data systems](/en/ch13#id366)
- data integration, [Data Integration](/en/ch13#sec_future_integration)-[Unifying batch and stream processing](/en/ch13#id338)
- goals for using, [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs)
- heterogeneous, keeping in sync, [Keeping Systems in Sync](/en/ch12#sec_stream_sync)
- maintainability, [Maintainability](/en/ch2#sec_introduction_maintainability)-[Evolvability: Making Change Easy](/en/ch2#sec_introduction_evolvability)
- possible faults in, [Transactions](/en/ch8#ch_transactions)
- reliability, [Reliability and Fault Tolerance](/en/ch2#sec_introduction_reliability)-[Humans and Reliability](/en/ch2#id31)
- hardware faults, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults)
- human errors, [Humans and Reliability](/en/ch2#id31)
- importance of, [Humans and Reliability](/en/ch2#id31)
- software faults, [Software faults](/en/ch2#software-faults)
- scalability, [Scalability](/en/ch2#sec_introduction_scalability)-[Principles for Scalability](/en/ch2#id35)
- unbundling databases, [Unbundling Databases](/en/ch13#sec_future_unbundling)-[Multi-shard data processing](/en/ch13#sec_future_unbundled_multi_shard)
- unreliable clocks, [Unreliable Clocks](/en/ch9#sec_distributed_clocks)-[Limiting the impact of garbage collection](/en/ch9#sec_distributed_gc_impact)
- data warehousing, [Data Warehousing](/en/ch1#sec_introduction_dwh), [Glossary](/en/glossary)
- cloud-based solutions, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- ETL (extract-transform-load), [Data Warehousing](/en/ch1#sec_introduction_dwh), [Keeping Systems in Sync](/en/ch12#sec_stream_sync)
- for batch processing, [Batch Processing](/en/ch11#ch_batch)
- keeping data systems in sync, [Keeping Systems in Sync](/en/ch12#sec_stream_sync)
- schema design, [Stars and Snowflakes: Schemas for Analytics](/en/ch3#sec_datamodels_analytics)
- sharding and clustering, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- slowly changing dimension (SCD), [Time-dependence of joins](/en/ch12#sec_stream_join_time)
- data-intensive applications, [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs)
- database administrator, [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- database-internal distributed transactions, [Distributed Transactions Across Different Systems](/en/ch8#sec_transactions_xa), [Database-internal Distributed Transactions](/en/ch8#sec_transactions_internal), [Atomic commit revisited](/en/ch12#sec_stream_atomic_commit)
- databases
- archival storage, [Archival storage](/en/ch5#archival-storage)
- comparison of message brokers to, [Message brokers compared to databases](/en/ch12#id297)
- dataflow through, [Dataflow Through Databases](/en/ch5#sec_encoding_dataflow_db)
- end-to-end argument for, [The end-to-end argument](/en/ch13#sec_future_e2e_argument)-[Applying end-to-end thinking in data systems](/en/ch13#id357)
- checking integrity, [The end-to-end argument again](/en/ch13#id456)
- relation to event streams, [Databases and Streams](/en/ch12#sec_stream_databases)-[Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- (see also changelogs)
- API support for change streams, [API support for change streams](/en/ch12#sec_stream_change_api), [Separation of application code and state](/en/ch13#id344)
- change data capture, [Change Data Capture](/en/ch12#sec_stream_cdc)-[API support for change streams](/en/ch12#sec_stream_change_api)
- event sourcing, [Change data capture versus event sourcing](/en/ch12#sec_stream_event_sourcing)
- keeping systems in sync, [Keeping Systems in Sync](/en/ch12#sec_stream_sync)
- philosophy of immutable events, [State, Streams, and Immutability](/en/ch12#sec_stream_immutability)-[Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- unbundling, [Unbundling Databases](/en/ch13#sec_future_unbundling)-[Multi-shard data processing](/en/ch13#sec_future_unbundled_multi_shard)
- composing data storage technologies, [Composing Data Storage Technologies](/en/ch13#id447)-[Unbundled versus integrated systems](/en/ch13#id448)
- designing applications around dataflow, [Designing Applications Around Dataflow](/en/ch13#sec_future_dataflow)-[Stream processors and services](/en/ch13#id345)
- observing derived state, [Observing Derived State](/en/ch13#sec_future_observing)-[Multi-shard data processing](/en/ch13#sec_future_unbundled_multi_shard)
- datacenters
- failures of, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults)
- geographically distributed (see regions (geographic distribution))
- multitenancy and shared resources, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- network architecture, [Cloud Computing Versus Supercomputing](/en/ch1#id17)
- network faults, [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)
- dataflow, [Modes of Dataflow](/en/ch5#sec_encoding_dataflow)-[Distributed actor frameworks](/en/ch5#distributed-actor-frameworks), [Designing Applications Around Dataflow](/en/ch13#sec_future_dataflow)-[Stream processors and services](/en/ch13#id345)
- correctness of dataflow systems, [Correctness of dataflow systems](/en/ch13#id453)
- dataflow engines, [Dataflow Engines](/en/ch11#sec_batch_dataflow)
- comparison to stream processing, [Processing Streams](/en/ch12#sec_stream_processing)
- DataFrames, [DataFrames](/en/ch11#id287)
- support in batch processing frameworks, [Batch Processing](/en/ch11#ch_batch)
- event-driven, [Event-Driven Architectures](/en/ch5#sec_encoding_dataflow_msg)-[Distributed actor frameworks](/en/ch5#distributed-actor-frameworks)
- reasoning about, [Reasoning about dataflows](/en/ch13#id443)
- through databases, [Dataflow Through Databases](/en/ch5#sec_encoding_dataflow_db)
- through services, [Dataflow Through Services: REST and RPC](/en/ch5#sec_encoding_dataflow_rpc)-[Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- workflow engines (see workflow engines)
- DataFrames, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- implementation, [DataFrames](/en/ch11#id287)
- in batch processing, [DataFrames](/en/ch11#id287)
- in notebooks, [Machine Learning](/en/ch11#id290)
- support in batch processing frameworks, [Batch Processing](/en/ch11#ch_batch)
- DataFusion (query engine), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- Datalog (query language), [Datalog: Recursive Relational Queries](/en/ch3#id62)
- Datastream (change data capture), [API support for change streams](/en/ch12#sec_stream_change_api)
- datatypes
- binary strings in XML and JSON, [JSON, XML, and Binary Variants](/en/ch5#sec_encoding_json)
- conflict-free, [CRDTs and Operational Transformation](/en/ch6#sec_replication_crdts)
- in Avro encodings, [Avro](/en/ch5#sec_encoding_avro)
- in Protocol Buffers, [Field tags and schema evolution](/en/ch5#field-tags-and-schema-evolution)
- numbers in XML and JSON, [JSON, XML, and Binary Variants](/en/ch5#sec_encoding_json)
- Datensparsamkeit, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance)
- Datomic (database)
- B-tree storage, [Indexes and snapshot isolation](/en/ch8#indexes-and-snapshot-isolation)
- data model, [Graph-Like Data Models](/en/ch3#sec_datamodels_graph), [Triple-Stores and SPARQL](/en/ch3#id59)
- Datalog query language, [Datalog: Recursive Relational Queries](/en/ch3#id62)
- excision (deleting data), [Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- languages for transactions, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs)
- serial execution of transactions, [Actual Serial Execution](/en/ch8#sec_transactions_serial)
- Daylight Saving Time (DST), [Time-of-day clocks](/en/ch9#time-of-day-clocks)
- Db2 (database)
- change data capture, [Implementing change data capture](/en/ch12#id307)
- DBA (database administrator), [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- deadlocks, [Explicit locking](/en/ch8#explicit-locking)
- detection, in distributed transactions, [Problems with XA transactions](/en/ch8#problems-with-xa-transactions)
- in two-phase locking (2PL), [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- Debezium (change data capture), [Implementing change data capture](/en/ch12#id307)
- Cassandra, [API support for change streams](/en/ch12#sec_stream_change_api)
- for data integration, [Unbundled versus integrated systems](/en/ch13#id448)
- declarative languages, [Data Models and Query Languages](/en/ch3#ch_datamodels), [Glossary](/en/glossary)
- and sync engines, [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- Datalog, [Datalog: Recursive Relational Queries](/en/ch3#id62)
- in document databases, [Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- recursive SQL queries, [Graph Queries in SQL](/en/ch3#id58)
- SPARQL, [The SPARQL query language](/en/ch3#the-sparql-query-language)
- DeepSeek
- 3FS (see 3FS)
- delays
- bounded network delays, [Synchronous Versus Asynchronous Networks](/en/ch9#sec_distributed_sync_networks)
- bounded process pauses, [Response time guarantees](/en/ch9#sec_distributed_clocks_realtime)
- unbounded network delays, [Timeouts and Unbounded Delays](/en/ch9#sec_distributed_queueing)
- unbounded process pauses, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- deleting data, [Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- in LSM storage, [Disk space usage](/en/ch4#disk-space-usage)
- legal basis, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance)
- Delta Lake (table format), [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- sharding and clustering, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- demilitarized zone (networking), [Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- denormalization (data representation), [Normalization, Denormalization, and Joins](/en/ch3#sec_datamodels_normalization)-[Many-to-One and Many-to-Many Relationships](/en/ch3#sec_datamodels_many_to_many), [Glossary](/en/glossary)
- in derived data systems, [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived)
- in event sourcing/CQRS, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- in social network case study, [Denormalization in the social networking case study](/en/ch3#denormalization-in-the-social-networking-case-study)
- materialized views, [Materialized Views and Data Cubes](/en/ch4#sec_storage_materialized_views)
- updating derived data, [Single-Object and Multi-Object Operations](/en/ch8#sec_transactions_multi_object), [The need for multi-object transactions](/en/ch8#sec_transactions_need), [Combining Specialized Tools by Deriving Data](/en/ch13#id442)
- versus normalization, [Deriving several views from the same event log](/en/ch12#sec_stream_deriving_views)
- derived data, [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived), [Stream Processing](/en/ch12#ch_stream), [Glossary](/en/glossary)
- batch processing, [Batch Processing](/en/ch11#ch_batch)
- event sourcing and CQRS, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- from change data capture, [Implementing change data capture](/en/ch12#id307)
- maintaining derived state through logs, [Databases and Streams](/en/ch12#sec_stream_databases)-[API support for change streams](/en/ch12#sec_stream_change_api), [State, Streams, and Immutability](/en/ch12#sec_stream_immutability)-[Concurrency control](/en/ch12#sec_stream_concurrency)
- observing, by subscribing to streams, [End-to-end event streams](/en/ch13#id349)
- outputs of batch and stream processing, [Batch and Stream Processing](/en/ch13#sec_future_batch_streaming)
- through application code, [Application code as a derivation function](/en/ch13#sec_future_dataflow_derivation)
- versus distributed transactions, [Derived data versus distributed transactions](/en/ch13#sec_future_derived_vs_transactions)
- design patterns, [Simplicity: Managing Complexity](/en/ch2#id38)
- deterministic operations, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs), [Faults and Partial Failures](/en/ch9#sec_distributed_partial_failure), [Glossary](/en/glossary)
- and idempotence, [Idempotence](/en/ch12#sec_stream_idempotence), [Reasoning about dataflows](/en/ch13#id443)
- computing derived data, [Maintaining derived state](/en/ch13#id446), [Correctness of dataflow systems](/en/ch13#id453), [Designing for auditability](/en/ch13#id365)
- in event sourcing, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- in state machine replication, [Using shared logs](/en/ch10#sec_consistency_smr), [Databases and Streams](/en/ch12#sec_stream_databases)
- in statement-based replication, [Statement-based replication](/en/ch6#statement-based-replication)
- in testing, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- joins, [Time-dependence of joins](/en/ch12#sec_stream_join_time)
- making code deterministic, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- overview, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- deterministic simulation testing (DST), [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- DevOps, [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- dimension tables, [Stars and Snowflakes: Schemas for Analytics](/en/ch3#sec_datamodels_analytics)
- dimensional modeling (see star schemas)
- directed acyclic graphs (DAG)
- workflows, [Scheduling Workflows](/en/ch11#sec_batch_workflows)
- (see also workflow engines)
- dirty reads (transaction isolation), [No dirty reads](/en/ch8#no-dirty-reads)
- dirty writes (transaction isolation), [No dirty writes](/en/ch8#sec_transactions_dirty_write)
- disaggregation
- of storage and compute, [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute)
- Discord (group chat)
- GraphQL example, [GraphQL](/en/ch3#id63)
- discrimination, [Bias and Discrimination](/en/ch14#id370)
- disks (see hard disks)
- distributed actor frameworks, [Distributed actor frameworks](/en/ch5#distributed-actor-frameworks)
- distributed filesystems, [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- comparison to object storage, [Object Stores](/en/ch11#id277)
- use by Flink, [Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- distributed ledgers, [Summary](/en/ch3#summary)
- distributed systems, [The Trouble with Distributed Systems](/en/ch9#ch_distributed)-[Summary](/en/ch9#summary), [Glossary](/en/glossary)
- Byzantine faults, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)-[Weak forms of lying](/en/ch9#weak-forms-of-lying)
- detecting network faults, [Detecting Faults](/en/ch9#id307)
- faults and partial failures, [Faults and Partial Failures](/en/ch9#sec_distributed_partial_failure)
- formalization of consensus, [Single-value consensus](/en/ch10#single-value-consensus)
- impossibility results, [The CAP theorem](/en/ch10#the-cap-theorem), [Consensus](/en/ch10#sec_consistency_consensus)
- issues with failover, [Leader failure: Failover](/en/ch6#leader-failure-failover)
- multi-region (see regions (geographic distribution))
- network problems, [Unreliable Networks](/en/ch9#sec_distributed_networks)-[Can we not simply make network delays predictable?](/en/ch9#can-we-not-simply-make-network-delays-predictable)
- problems with, [Problems with Distributed Systems](/en/ch1#sec_introduction_dist_sys_problems)
- quorums, relying on, [The Majority Rules](/en/ch9#sec_distributed_majority)
- reasons for using, [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed), [Replication](/en/ch6#ch_replication)
- synchronized clocks, relying on, [Relying on Synchronized Clocks](/en/ch9#sec_distributed_clocks_relying)-[Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- system models, [System Model and Reality](/en/ch9#sec_distributed_system_model)-[Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- use of clocks and time, [Unreliable Clocks](/en/ch9#sec_distributed_clocks)
- distributed transactions (see transactions)
- Django (web framework), [Handling errors and aborts](/en/ch8#handling-errors-and-aborts)
- DMZ (demilitarized zone), [Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- DNS (Domain Name System), [Request Routing](/en/ch7#sec_sharding_routing), [Service discovery](/en/ch10#service-discovery)
- for load balancing, [Load balancers, service discovery, and service meshes](/en/ch5#sec_encoding_service_discovery)
- Docker (container manager), [Separation of application code and state](/en/ch13#id344)
- document data model, [Relational Model versus Document Model](/en/ch3#sec_datamodels_history)-[Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- comparison to relational model, [When to Use Which Model](/en/ch3#sec_datamodels_document_summary)-[Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- multi-object transactions, need for, [The need for multi-object transactions](/en/ch8#sec_transactions_need)
- sharded secondary indexes, [Sharding and Secondary Indexes](/en/ch7#sec_sharding_secondary_indexes)
- versus relational model
- convergence of models, [Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- data locality, [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality)
- document-partitioned indexes (see local secondary indexes)
- domain-driven design (DDD), [Simplicity: Managing Complexity](/en/ch2#id38), [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- dotted version vectors, [Version vectors](/en/ch6#version-vectors)
- double-entry bookkeeping, [Summary](/en/ch3#summary)
- DRBD (Distributed Replicated Block Device), [Single-Leader Replication](/en/ch6#sec_replication_leader)
- drift (clocks), [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy)
- Druid (database), [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp), [Column-Oriented Storage](/en/ch4#sec_storage_column), [Deriving several views from the same event log](/en/ch12#sec_stream_deriving_views)
- handling writes, [Writing to Column-Oriented Storage](/en/ch4#writing-to-column-oriented-storage)
- pre-aggregation, [Analytics](/en/ch11#sec_batch_olap)
- serving derived data, [Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- Dryad (dataflow engine), [Dataflow Engines](/en/ch11#sec_batch_dataflow)
- dual writes, problems with, [Keeping Systems in Sync](/en/ch12#sec_stream_sync)
- DuckDB (database), [Problems with Distributed Systems](/en/ch1#sec_introduction_dist_sys_problems), [Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- column-oriented storage, [Column-Oriented Storage](/en/ch4#sec_storage_column)
- use for ETL, [Extract--Transform--Load (ETL)](/en/ch11#sec_batch_etl_usage)
- duplicates, suppression of, [Duplicate suppression](/en/ch13#id354)
- (see also idempotence)
- using a unique ID, [Uniquely identifying requests](/en/ch13#id355), [Multi-shard request processing](/en/ch13#id360)
- durability (transactions), [Making B-trees reliable](/en/ch4#sec_storage_btree_wal), [Durability](/en/ch8#durability), [Glossary](/en/glossary)
- durable execution, [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows)
- reliance on determinism, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- Restate (see Restate (workflow engine))
- Temporal (see Temporal (workflow engine))
- durable functions (see workflow engines)
- duration (time), [Unreliable Clocks](/en/ch9#sec_distributed_clocks)
- measurement with monotonic clocks, [Monotonic clocks](/en/ch9#monotonic-clocks)
- dynamically typed languages
- analogy to schema-on-read, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility)
- Dynamo (database), [Leaderless Replication](/en/ch6#sec_replication_leaderless)
- Dynamo-style databases (see leaderless replication)
- DynamoDB (database)
- auto-scaling, [Operations: Automatic or Manual Rebalancing](/en/ch7#sec_sharding_operations)
- hash-range sharding, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- leader-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- sharded secondary indexes, [Global Secondary Indexes](/en/ch7#id167)
### E
- EBS (virtual block device), [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute)
- compared to object storage, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- ECC (see error-correcting codes)
- EDB Postgres Distributed (database), [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- edges (in graphs), [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- property graph model, [Property Graphs](/en/ch3#id56)
- edit distance (full-text search), [Full-Text Search](/en/ch4#sec_storage_full_text)
- effectively-once semantics, [Fault Tolerance](/en/ch12#sec_stream_fault_tolerance), [Exactly-once execution of an operation](/en/ch13#id353)
- (see also exactly-once semantics)
- preservation of integrity, [Correctness of dataflow systems](/en/ch13#id453)
- Elastic Compute Cloud (EC2)
- spot instances, [Handling Faults](/en/ch11#id281)
- elasticity, [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed)
- cloud data warehouses, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses), [Query languages](/en/ch11#sec_batch_query_lanauges)
- Elasticsearch (search server)
- local secondary indexes, [Local Secondary Indexes](/en/ch7#id166)
- percolator (stream search), [Search on streams](/en/ch12#id320)
- serving derived data, [Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- shard rebalancing, [Fixed number of shards](/en/ch7#fixed-number-of-shards)
- use of Lucene, [Full-Text Search](/en/ch4#sec_storage_full_text)
- Elm (programming language), [End-to-end event streams](/en/ch13#id349)
- ELT (extract-load-transform), [Data Warehousing](/en/ch1#sec_introduction_dwh)
- relation to batch processing, [Extract--Transform--Load (ETL)](/en/ch11#sec_batch_etl_usage)
- embarrassingly parallel (algorithms)
- ETL (see ETL (extract-transform-load))
- MapReduce, [MapReduce](/en/ch11#sec_batch_mapreduce)
- (see also MapReduce)
- embedded storage engines, [Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- embedding (vector), [Vector Embeddings](/en/ch4#id92)
- encodings (data formats), [Encoding and Evolution](/en/ch5#ch_encoding)-[The Merits of Schemas](/en/ch5#sec_encoding_schemas)
- Avro, [Avro](/en/ch5#sec_encoding_avro)-[Dynamically generated schemas](/en/ch5#dynamically-generated-schemas)
- binary variants of JSON and XML, [Binary encoding](/en/ch5#binary-encoding)
- compatibility, [Encoding and Evolution](/en/ch5#ch_encoding)
- calling services, [Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- using databases, [Dataflow Through Databases](/en/ch5#sec_encoding_dataflow_db)-[Archival storage](/en/ch5#archival-storage)
- defined, [Formats for Encoding Data](/en/ch5#sec_encoding_formats)
- JSON, XML, and CSV, [JSON, XML, and Binary Variants](/en/ch5#sec_encoding_json)
- language-specific formats, [Language-Specific Formats](/en/ch5#id96)
- merits of schemas, [The Merits of Schemas](/en/ch5#sec_encoding_schemas)
- Protocol Buffers, [Protocol Buffers](/en/ch5#sec_encoding_protobuf)-[Field tags and schema evolution](/en/ch5#field-tags-and-schema-evolution)
- representations of data, [Formats for Encoding Data](/en/ch5#sec_encoding_formats)
- end-to-end argument, [The end-to-end argument](/en/ch13#sec_future_e2e_argument)-[Applying end-to-end thinking in data systems](/en/ch13#id357)
- checking integrity, [The end-to-end argument again](/en/ch13#id456)
- publish/subscribe streams, [End-to-end event streams](/en/ch13#id349)
- enrichment (stream), [Stream-table join (stream enrichment)](/en/ch12#sec_stream_table_joins)
- Enterprise JavaBeans (EJB), [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc)
- enterprise software, [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs)
- entities (see vertices)
- ephemeral storage, [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute)
- epoch (consensus algorithms), [From single-leader replication to consensus](/en/ch10#from-single-leader-replication-to-consensus)
- epoch (Unix timestamps), [Time-of-day clocks](/en/ch9#time-of-day-clocks)
- erasure coding (error correction), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- error handling
- for network faults, [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)
- in transactions, [Handling errors and aborts](/en/ch8#handling-errors-and-aborts)
- error-correcting codes, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- Esper (CEP engine), [Complex event processing](/en/ch12#id317)
- essential complexity, [Simplicity: Managing Complexity](/en/ch2#id38)
- etcd (coordination service), [Coordination Services](/en/ch10#sec_consistency_coordination)-[Service discovery](/en/ch10#service-discovery)
- generating fencing tokens, [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens), [Coordination Services](/en/ch10#sec_consistency_coordination)
- linearizable operations, [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable), [Subtleties of consensus](/en/ch10#subtleties-of-consensus)
- locks and leader election, [Locking and leader election](/en/ch10#locking-and-leader-election)
- use for service discovery, [Load balancers, service discovery, and service meshes](/en/ch5#sec_encoding_service_discovery), [Service discovery](/en/ch10#service-discovery)
- use for shard assignment, [Request Routing](/en/ch7#sec_sharding_routing)
- use of Raft algorithm, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- Ethereum (blockchain), [Tools for auditable data systems](/en/ch13#id366)
- Ethernet (networks), [Cloud Computing Versus Supercomputing](/en/ch1#id17), [Unreliable Networks](/en/ch9#sec_distributed_networks), [Can we not simply make network delays predictable?](/en/ch9#can-we-not-simply-make-network-delays-predictable)
- packet checksums, [Weak forms of lying](/en/ch9#weak-forms-of-lying), [The end-to-end argument](/en/ch13#sec_future_e2e_argument)
- ethics, [Doing the Right Thing](/en/ch14)-[Legislation and Self-Regulation](/en/ch14#sec_future_legislation)
- code of ethics and professional practice, [Doing the Right Thing](/en/ch14)
- legislation and self-regulation, [Legislation and Self-Regulation](/en/ch14#sec_future_legislation)
- predictive analytics, [Predictive Analytics](/en/ch14#id369)-[Feedback Loops](/en/ch14#id372)
- amplifying bias, [Bias and Discrimination](/en/ch14#id370)
- feedback loops, [Feedback Loops](/en/ch14#id372)
- privacy and tracking, [Privacy and Tracking](/en/ch14#id373)-[Legislation and Self-Regulation](/en/ch14#sec_future_legislation)
- consent and freedom of choice, [Consent and Freedom of Choice](/en/ch14#id375)
- data as assets and power, [Data as Assets and Power](/en/ch14#id376)
- meaning of privacy, [Privacy and Use of Data](/en/ch14#id457)
- surveillance, [Surveillance](/en/ch14#id374)
- respect, dignity, and agency, [Legislation and Self-Regulation](/en/ch14#sec_future_legislation)
- unintended consequences, [Doing the Right Thing](/en/ch14), [Feedback Loops](/en/ch14#id372)
- ETL (extract-transform-load), [Data Warehousing](/en/ch1#sec_introduction_dwh), [Keeping Systems in Sync](/en/ch12#sec_stream_sync), [Glossary](/en/glossary)
- relation to batch processing, [Extract--Transform--Load (ETL)](/en/ch11#sec_batch_etl_usage)
- using batch processing, [Batch Processing](/en/ch11#ch_batch)
- Euclidean distance (semantic search), [Vector Embeddings](/en/ch4#id92)
- European Union
- AI Act (see AI Act)
- GDPR (see GDPR)
- event sourcing, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- and change data capture, [Change data capture versus event sourcing](/en/ch12#sec_stream_event_sourcing)
- comparison to change data capture, [Change data capture versus event sourcing](/en/ch12#sec_stream_event_sourcing)
- immutability and auditability, [State, Streams, and Immutability](/en/ch12#sec_stream_immutability), [Designing for auditability](/en/ch13#id365)
- large, reliable data systems, [Uniquely identifying requests](/en/ch13#id355), [Correctness of dataflow systems](/en/ch13#id453)
- reliance on determinism, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- event streams (see streams)
- event-driven architecture, [Event-Driven Architectures](/en/ch5#sec_encoding_dataflow_msg)-[Distributed actor frameworks](/en/ch5#distributed-actor-frameworks)
- distributed actor frameworks, [Distributed actor frameworks](/en/ch5#distributed-actor-frameworks)
- events, [Transmitting Event Streams](/en/ch12#sec_stream_transmit)
- deciding on total order of, [The limits of total ordering](/en/ch13#id335)
- deriving views from event log, [Deriving several views from the same event log](/en/ch12#sec_stream_deriving_views)
- event time versus processing time, [Event time versus processing time](/en/ch12#id322), [Microbatching and checkpointing](/en/ch12#id329), [Unifying batch and stream processing](/en/ch13#id338)
- immutable, advantages of, [Advantages of immutable events](/en/ch12#sec_stream_immutability_pros), [Designing for auditability](/en/ch13#id365)
- ordering to capture causality, [Ordering events to capture causality](/en/ch13#sec_future_capture_causality)
- reads as, [Reads are events too](/en/ch13#sec_future_read_events)
- stragglers, [Handling straggler events](/en/ch12#id323)
- timestamp of, in stream processing, [Whose clock are you using, anyway?](/en/ch12#id438)
- EventSource (browser API), [Pushing state changes to clients](/en/ch13#id348)
- EventStoreDB (database), [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- eventual consistency, [Replication](/en/ch6#ch_replication), [Problems with Replication Lag](/en/ch6#sec_replication_lag), [Safety and liveness](/en/ch9#sec_distributed_safety_liveness)
- (see also conflicts)
- and perpetual inconsistency, [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- strong eventual consistency, [Automatic conflict resolution](/en/ch6#automatic-conflict-resolution)
- evidence
- data used as, [Humans and Reliability](/en/ch2#id31)
- evolvability, [Evolvability: Making Change Easy](/en/ch2#sec_introduction_evolvability), [Encoding and Evolution](/en/ch5#ch_encoding)
- calling services, [Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- event sourcing, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- graph-structured data, [Property Graphs](/en/ch3#id56)
- of databases, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility), [Dataflow Through Databases](/en/ch5#sec_encoding_dataflow_db)-[Archival storage](/en/ch5#archival-storage), [Deriving several views from the same event log](/en/ch12#sec_stream_deriving_views), [Reprocessing data for application evolution](/en/ch13#sec_future_reprocessing)
- reprocessing data, [Reprocessing data for application evolution](/en/ch13#sec_future_reprocessing), [Unifying batch and stream processing](/en/ch13#id338)
- schema evolution in Avro, [The writer's schema and the reader's schema](/en/ch5#the-writers-schema-and-the-readers-schema)
- schema evolution in Protocol Buffers, [Field tags and schema evolution](/en/ch5#field-tags-and-schema-evolution)
- schema-on-read, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility), [Encoding and Evolution](/en/ch5#ch_encoding), [The Merits of Schemas](/en/ch5#sec_encoding_schemas)
- exactly-once semantics, [Exactly-once message processing](/en/ch8#sec_transactions_exactly_once), [Exactly-once message processing revisited](/en/ch8#exactly-once-message-processing-revisited), [Fault Tolerance](/en/ch12#sec_stream_fault_tolerance), [Exactly-once execution of an operation](/en/ch13#id353)
- parity with batch processors, [Unifying batch and stream processing](/en/ch13#id338)
- preservation of integrity, [Correctness of dataflow systems](/en/ch13#id453)
- using durable execution, [Durable execution](/en/ch5#durable-execution)
- exclusive mode (locks), [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- exponential backoff, [Describing Performance](/en/ch2#sec_introduction_percentiles), [Handling errors and aborts](/en/ch8#handling-errors-and-aborts)
- ext4 (file system), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- eXtended Architecture transactions (see XA transactions)
- extract-transform-load (see ETL)
### F
- Facebook
- Faiss (vector index), [Vector Embeddings](/en/ch4#id92)
- React (user interface library), [End-to-end event streams](/en/ch13#id349)
- social graphs, [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- facts
- fact table (star schema), [Stars and Snowflakes: Schemas for Analytics](/en/ch3#sec_datamodels_analytics)
- in Datalog, [Datalog: Recursive Relational Queries](/en/ch3#id62)
- in event sourcing, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- fail-slow faults, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- fail-stop model, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- failover, [Leader failure: Failover](/en/ch6#leader-failure-failover), [Glossary](/en/glossary)
- (see also leader-based replication)
- in leaderless replication, absence of, [Writing to the Database When a Node Is Down](/en/ch6#id287)
- leader election, [Distributed Locks and Leases](/en/ch9#sec_distributed_lock_fencing), [Consensus](/en/ch10#sec_consistency_consensus), [From single-leader replication to consensus](/en/ch10#from-single-leader-replication-to-consensus)
- potential problems, [Leader failure: Failover](/en/ch6#leader-failure-failover)
- failures
- amplification by distributed transactions, [Maintaining derived state](/en/ch13#id446)
- failure detection, [Detecting Faults](/en/ch9#id307)
- automatic rebalancing causing cascading failures, [Operations: Automatic or Manual Rebalancing](/en/ch7#sec_sharding_operations)
- timeouts and unbounded delays, [Timeouts and Unbounded Delays](/en/ch9#sec_distributed_queueing), [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- using a coordination service, [Coordination Services](/en/ch10#sec_consistency_coordination)
- faults versus, [Reliability and Fault Tolerance](/en/ch2#sec_introduction_reliability)
- partial failures, [Faults and Partial Failures](/en/ch9#sec_distributed_partial_failure), [Summary](/en/ch9#summary)
- Faiss (vector index), [Vector Embeddings](/en/ch4#id92)
- false positive (Bloom filters), [Bloom filters](/en/ch4#bloom-filters)
- fan-out (messaging systems), [Materializing and Updating Timelines](/en/ch2#sec_introduction_materializing), [Multiple consumers](/en/ch12#id298)
- fault injection, [Fault Tolerance](/en/ch2#id27), [Network Faults in Practice](/en/ch9#sec_distributed_network_faults), [Fault injection](/en/ch9#sec_fault_injection)
- fault isolation, [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- fault tolerance, [Reliability and Fault Tolerance](/en/ch2#sec_introduction_reliability)-[Humans and Reliability](/en/ch2#id31), [Glossary](/en/glossary)
- formalization in consensus, [Single-value consensus](/en/ch10#single-value-consensus)
- human fault tolerance, [Batch Processing](/en/ch11#ch_batch)
- in batch processing, [Handling Faults](/en/ch11#id281)
- in log-based systems, [Applying end-to-end thinking in data systems](/en/ch13#id357), [Timeliness and Integrity](/en/ch13#sec_future_integrity)-[Correctness of dataflow systems](/en/ch13#id453)
- in stream processing, [Fault Tolerance](/en/ch12#sec_stream_fault_tolerance)-[Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- atomic commit, [Atomic commit revisited](/en/ch12#sec_stream_atomic_commit)
- idempotence, [Idempotence](/en/ch12#sec_stream_idempotence)
- maintaining derived state, [Maintaining derived state](/en/ch13#id446)
- microbatching and checkpointing, [Microbatching and checkpointing](/en/ch12#id329)
- rebuilding state after a failure, [Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- of distributed transactions, [XA transactions](/en/ch8#xa-transactions)-[Exactly-once message processing revisited](/en/ch8#exactly-once-message-processing-revisited)
- of leader-based and leaderless replication, [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- transaction atomicity, [Atomicity](/en/ch8#sec_transactions_acid_atomicity), [Distributed Transactions](/en/ch8#sec_transactions_distributed)-[Exactly-once message processing](/en/ch8#sec_transactions_exactly_once)
- faults
- Byzantine faults, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)-[Weak forms of lying](/en/ch9#weak-forms-of-lying)
- failures versus, [Reliability and Fault Tolerance](/en/ch2#sec_introduction_reliability)
- handled by transactions, [Transactions](/en/ch8#ch_transactions)
- handling in supercomputers and cloud computing, [Cloud Computing Versus Supercomputing](/en/ch1#id17)
- hardware, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults)
- in distributed systems, [Faults and Partial Failures](/en/ch9#sec_distributed_partial_failure)
- introducing deliberately (see fault injection)
- network faults, [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)-[Detecting Faults](/en/ch9#id307)
- asymmetric faults, [The Majority Rules](/en/ch9#sec_distributed_majority)
- detecting, [Detecting Faults](/en/ch9#id307)
- tolerance of, in multi-leader replication, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- software faults, [Software faults](/en/ch2#software-faults)
- tolerating (see fault tolerance)
- feature engineering (machine learning), [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake)
- federated databases, [The meta-database of everything](/en/ch13#id341)
- Feldera (database)
- incremental view maintenance, [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- fence (CPU instruction), [Linearizability and network delays](/en/ch10#linearizability-and-network-delays)
- fencing (preventing split brain), [Leader failure: Failover](/en/ch6#leader-failure-failover), [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens)-[Fencing with multiple replicas](/en/ch9#fencing-with-multiple-replicas)
- generating fencing tokens, [Using shared logs](/en/ch10#sec_consistency_smr), [Coordination Services](/en/ch10#sec_consistency_coordination)
- properties of fencing tokens, [Defining the correctness of an algorithm](/en/ch9#defining-the-correctness-of-an-algorithm)
- stream processors writing to databases, [Idempotence](/en/ch12#sec_stream_idempotence), [Exactly-once execution of an operation](/en/ch13#id353)
- fetch-and-add
- relation to consensus, [Fetch-and-add as consensus](/en/ch10#fetch-and-add-as-consensus)
- Fibre Channel (networks), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- field tags (Protocol Buffers), [Protocol Buffers](/en/ch5#sec_encoding_protobuf)-[Field tags and schema evolution](/en/ch5#field-tags-and-schema-evolution)
- Figma (graphics software), [Real-time collaboration, offline-first, and local-first apps](/en/ch6#real-time-collaboration-offline-first-and-local-first-apps)
- filesystem in userspace (FUSE), [Setting Up New Followers](/en/ch6#sec_replication_new_replica), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- on object storage, [Object Stores](/en/ch11#id277)
- financial data
- accounting ledgers, [Summary](/en/ch3#summary)
- immutability, [Advantages of immutable events](/en/ch12#sec_stream_immutability_pros)
- time series data, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- Fivetran, [Data Warehousing](/en/ch1#sec_introduction_dwh)
- FizzBee (specification language), [Model checking and specification languages](/en/ch9#model-checking-and-specification-languages)
- flat index (vector index), [Vector Embeddings](/en/ch4#id92)
- FlatBuffers (data format), [Formats for Encoding Data](/en/ch5#sec_encoding_formats)
- Flink (processing framework), [Batch Processing](/en/ch11#ch_batch), [Dataflow Engines](/en/ch11#sec_batch_dataflow)
- cost efficiency, [Query languages](/en/ch11#sec_batch_query_lanauges)
- DataFrames, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes), [DataFrames](/en/ch11#id287)
- fault tolerance, [Handling Faults](/en/ch11#id281), [Microbatching and checkpointing](/en/ch12#id329), [Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- FlinkML, [Machine Learning](/en/ch11#id290)
- for data warehouses, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- high availability using ZooKeeper, [Coordination Services](/en/ch10#sec_consistency_coordination)
- integration of batch and stream processing, [Unifying batch and stream processing](/en/ch13#id338)
- query optimizer, [Query languages](/en/ch11#sec_batch_query_lanauges)
- shuffling data, [Shuffling Data](/en/ch11#sec_shuffle)
- stream processing, [Stream analytics](/en/ch12#id318)
- streaming SQL support, [Complex event processing](/en/ch12#id317)
- flow control, [The Limitations of TCP](/en/ch9#sec_distributed_tcp), [Messaging Systems](/en/ch12#sec_stream_messaging), [Glossary](/en/glossary)
- FLP result (on consensus), [Consensus](/en/ch10#sec_consistency_consensus)
- Flyte (workflow scheduler), [Machine Learning](/en/ch11#id290)
- followers, [Single-Leader Replication](/en/ch6#sec_replication_leader), [Glossary](/en/glossary)
- (see also leader-based replication)
- formal methods, [Formal Methods and Randomized Testing](/en/ch9#sec_distributed_formal)-[Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- forward compatibility, [Encoding and Evolution](/en/ch5#ch_encoding)
- forward decay (algorithm), [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla)
- Fossil (version control system), [Concurrency control](/en/ch12#sec_stream_concurrency)
- shunning (deleting data), [Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- FoundationDB (database)
- consistency model, [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- deterministic simulation testing, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- key-range sharding, [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- process-per-core model, [Pros and Cons of Sharding](/en/ch7#sec_sharding_reasons)
- serializable transactions, [Serializable Snapshot Isolation (SSI)](/en/ch8#sec_transactions_ssi), [Performance of serializable snapshot isolation](/en/ch8#performance-of-serializable-snapshot-isolation)
- transactions, [What Exactly Is a Transaction?](/en/ch8#sec_transactions_overview), [Database-internal Distributed Transactions](/en/ch8#sec_transactions_internal)
- fractional indexing, [When to Use Which Model](/en/ch3#sec_datamodels_document_summary)
- fragmentation (of B-trees), [Disk space usage](/en/ch4#disk-space-usage)
- frame (computer graphics), [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- frontend (web development), [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs)
- FrostDB (database)
- deterministic simulation testing (DST), [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- fsync (system call), [Making B-trees reliable](/en/ch4#sec_storage_btree_wal), [Durability](/en/ch8#durability)
- full-text search, [Full-Text Search](/en/ch4#sec_storage_full_text), [Glossary](/en/glossary)
- and fuzzy indexes, [Full-Text Search](/en/ch4#sec_storage_full_text)
- Lucene storage engine, [Full-Text Search](/en/ch4#sec_storage_full_text)
- sharded indexes, [Sharding and Secondary Indexes](/en/ch7#sec_sharding_secondary_indexes)
- Function as a Service (FaaS), [Microservices and Serverless](/en/ch1#sec_introduction_microservices)
- functional programming
- inspiration for MapReduce, [MapReduce](/en/ch11#sec_batch_mapreduce)
- functional requirements, [Defining Nonfunctional Requirements](/en/ch2#ch_nonfunctional)
- FUSE (see filesystem in userspace (FUSE))
- fuzzing, [Formal Methods and Randomized Testing](/en/ch9#sec_distributed_formal)
- fuzzy search (see similarity search)
### G
- Gallina (specification language), [Model checking and specification languages](/en/ch9#model-checking-and-specification-languages)
- game development, [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- garbage collection
- immutability and, [Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- process pauses for, [Latency and Response Time](/en/ch2#id23), [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)-[Limiting the impact of garbage collection](/en/ch9#sec_distributed_gc_impact), [The Majority Rules](/en/ch9#sec_distributed_majority)
- (see also process pauses)
- gas stations, algorithmic pricing, [Feedback Loops](/en/ch14#id372)
- GDPR (regulation), [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance), [Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- consent, [Consent and Freedom of Choice](/en/ch14#id375)
- data minimization, [Legislation and Self-Regulation](/en/ch14#sec_future_legislation)
- legitimate interest, [Consent and Freedom of Choice](/en/ch14#id375)
- right of access, [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- right to erasure, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance), [Disk space usage](/en/ch4#disk-space-usage), [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- GenBank (genome database), [Summary](/en/ch3#summary)
- General Data Protection Regulation (see GDPR (regulation))
- genome analysis, [Summary](/en/ch3#summary)
- geographic distribution (see regions (geographic distribution))
- geospatial indexes, [Multidimensional and Full-Text Indexes](/en/ch4#sec_storage_multidimensional)
- Git (version control system), [Concurrency control](/en/ch12#sec_stream_concurrency)
- local-first software, [Real-time collaboration, offline-first, and local-first apps](/en/ch6#real-time-collaboration-offline-first-and-local-first-apps)
- merge conflicts, [Manual conflict resolution](/en/ch6#manual-conflict-resolution)
- GitHub, postmortems, [Leader failure: Failover](/en/ch6#leader-failure-failover), [Mapping system models to the real world](/en/ch9#mapping-system-models-to-the-real-world)
- global secondary indexes, [Global Secondary Indexes](/en/ch7#id167), [Summary](/en/ch7#summary)
- globally unique identifiers (see UUIDs)
- GlusterFS (distributed filesystem), [Batch Processing](/en/ch11#ch_batch), [Distributed Filesystems](/en/ch11#sec_batch_dfs), [Object Stores](/en/ch11#id277)
- GNU Coreutils (Linux), [Sorting Versus In-memory Aggregation](/en/ch11#id275)
- Go (programming language)
- garbage collection, [Limiting the impact of garbage collection](/en/ch9#sec_distributed_gc_impact)
- GoldenGate (change data capture), [Implementing change data capture](/en/ch12#id307)
- (see also Oracle)
- Google
- BigQuery (see BigQuery (database))
- Bigtable (see Bigtable (database))
- Chubby (lock service), [Coordination Services](/en/ch10#sec_consistency_coordination)
- Cloud Storage (object storage), [Setting Up New Followers](/en/ch6#sec_replication_new_replica), [Object Stores](/en/ch11#id277)
- request preconditions, [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens)
- Compute Engine
- preemptible instances, [Handling Faults](/en/ch11#id281)
- Dataflow (stream processing)
- data warehouse integration, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- shuffling data, [Shuffling Data](/en/ch11#sec_shuffle)
- Dataflow (stream processor), [Stream analytics](/en/ch12#id318), [Atomic commit revisited](/en/ch12#sec_stream_atomic_commit), [Unifying batch and stream processing](/en/ch13#id338)
- (see also Beam)
- Datastream (change data capture), [API support for change streams](/en/ch12#sec_stream_change_api)
- Docs (collaborative editor), [Real-time collaboration, offline-first, and local-first apps](/en/ch6#real-time-collaboration-offline-first-and-local-first-apps), [CRDTs and Operational Transformation](/en/ch6#sec_replication_crdts)
- operational transformation, [CRDTs and Operational Transformation](/en/ch6#sec_replication_crdts)
- Dremel (query engine), [Column-Oriented Storage](/en/ch4#sec_storage_column)
- Firestore (database), [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- MapReduce (batch processing), [Batch Processing](/en/ch11#ch_batch)
- (see also MapReduce)
- Percolator (transaction system), [Implementing a linearizable ID generator](/en/ch10#implementing-a-linearizable-id-generator)
- persistent disks (cloud service), [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute)
- Pub/Sub (messaging), [Message brokers](/en/ch5#message-brokers), [Message brokers compared to databases](/en/ch12#id297), [Using logs for message storage](/en/ch12#id300)
- response time study, [Average, Median, and Percentiles](/en/ch2#id24)
- Sheets (collaborative spreadsheet), [Real-time collaboration, offline-first, and local-first apps](/en/ch6#real-time-collaboration-offline-first-and-local-first-apps), [CRDTs and Operational Transformation](/en/ch6#sec_replication_crdts)
- Spanner (see Spanner (database))
- TrueTime (clock API), [Clock readings with a confidence interval](/en/ch9#clock-readings-with-a-confidence-interval)
- gossip protocol, [Request Routing](/en/ch7#sec_sharding_routing)
- governance, [Beyond the data lake](/en/ch1#beyond-the-data-lake)
- government use of data, [Data as Assets and Power](/en/ch14#id376)
- GPS (Global Positioning System)
- use for clock synchronization, [Unreliable Clocks](/en/ch9#sec_distributed_clocks), [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy), [Clock readings with a confidence interval](/en/ch9#clock-readings-with-a-confidence-interval), [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- GPT (language model), [Vector Embeddings](/en/ch4#id92)
- GPU (graphics processing unit), [Layering of cloud services](/en/ch1#layering-of-cloud-services), [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed)
- gradual rollout (see rolling upgrades)
- GraphQL (query language), [GraphQL](/en/ch3#id63)
- validation, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs)
- graphs, [Glossary](/en/glossary)
- as data models, [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)-[GraphQL](/en/ch3#id63)
- property graphs, [Property Graphs](/en/ch3#id56)
- RDF and triple-stores, [Triple-Stores and SPARQL](/en/ch3#id59)-[The SPARQL query language](/en/ch3#the-sparql-query-language)
- DAGs (see directed acyclic graphs)
- processing and analysis, [Machine Learning](/en/ch11#id290)
- query languages
- Cypher, [The Cypher Query Language](/en/ch3#id57)
- Datalog, [Datalog: Recursive Relational Queries](/en/ch3#id62)
- GraphQL, [GraphQL](/en/ch3#id63)
- Gremlin, [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- recursive SQL queries, [Graph Queries in SQL](/en/ch3#id58)
- SPARQL, [The SPARQL query language](/en/ch3#the-sparql-query-language)
- traversal, [Property Graphs](/en/ch3#id56)
- gray failures, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- in leaderless replication, [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- Gremlin (graph query language), [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- grep (Unix tool), [Simple Log Analysis](/en/ch11#sec_batch_log_analysis)
- gRPC (service calls), [Microservices and Serverless](/en/ch1#sec_introduction_microservices), [Web services](/en/ch5#sec_web_services)
- forward and backward compatibility, [Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- GUIDs (see UUIDs)
### H
- Hadoop (data infrastructure)
- comparison to distributed databases, [Batch Processing](/en/ch11#ch_batch)
- MapReduce (see MapReduce)
- NodeManager, [Distributed Job Orchestration](/en/ch11#id278)
- YARN (see YARN (job scheduler))
- HANA (see SAP HANA (database))
- happens-before relation, [The "happens-before" relation and concurrency](/en/ch6#sec_replication_happens_before)
- hard disks
- access patterns, [Sequential versus random writes](/en/ch4#sidebar_sequential)
- detecting corruption, [The end-to-end argument](/en/ch13#sec_future_e2e_argument), [Don't just blindly trust what they promise](/en/ch13#id364)
- faults in, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults), [Durability](/en/ch8#durability)
- sequential vs. random writes, [Sequential versus random writes](/en/ch4#sidebar_sequential)
- sequential write throughput, [Disk space usage](/en/ch12#sec_stream_disk_usage)
- hardware faults, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults)
- hash function
- in Bloom filters, [Bloom filters](/en/ch4#bloom-filters)
- hash join
- in stream processing, [Stream-table join (stream enrichment)](/en/ch12#sec_stream_table_joins)
- hash sharding, [Sharding by Hash of Key](/en/ch7#sec_sharding_hash)-[Consistent hashing](/en/ch7#sec_sharding_consistent_hashing), [Summary](/en/ch7#summary)
- consistent hashing, [Consistent hashing](/en/ch7#sec_sharding_consistent_hashing)
- problems with hash mod N, [Hash modulo number of nodes](/en/ch7#hash-modulo-number-of-nodes)
- range queries, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- suitable hash functions, [Sharding by Hash of Key](/en/ch7#sec_sharding_hash)
- with fixed number of shards, [Fixed number of shards](/en/ch7#fixed-number-of-shards)
- hash tables, [Log-Structured Storage](/en/ch4#sec_storage_log_structured)
- Hazelcast (in-memory data grid)
- FencedLock, [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens)
- Flake ID Generator, [ID Generators and Logical Clocks](/en/ch10#sec_consistency_logical)
- HBase (database)
- bug due to lack of fencing, [Distributed Locks and Leases](/en/ch9#sec_distributed_lock_fencing)
- key-range sharding, [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- log-structured storage, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- regions (sharding), [Sharding](/en/ch7#ch_sharding)
- request routing, [Request Routing](/en/ch7#sec_sharding_routing)
- size-tiered compaction, [Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- wide-column data model, [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality), [Column Compression](/en/ch4#sec_storage_column_compression)
- HDFS (Hadoop Distributed File System), [Batch Processing](/en/ch11#ch_batch), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- (see also distributed filesystems)
- checking data integrity, [Don't just blindly trust what they promise](/en/ch13#id364)
- DataNode, [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- NameNode, [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- use in MapReduce, [MapReduce](/en/ch11#sec_batch_mapreduce)
- workflow example, [Scheduling Workflows](/en/ch11#sec_batch_workflows)
- HdrHistogram (numerical library), [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla)
- head (Unix tool), [Simple Log Analysis](/en/ch11#sec_batch_log_analysis), [Distributed Job Orchestration](/en/ch11#id278)
- head vertex (property graphs), [Property Graphs](/en/ch3#id56)
- head-of-line blocking, [Latency and Response Time](/en/ch2#id23)
- heap files (databases), [Storing values within the index](/en/ch4#sec_storage_index_heap)
- in multiversion concurrency control, [Multi-version concurrency control (MVCC)](/en/ch8#sec_transactions_snapshot_impl)
- heat management, [Skewed Workloads and Relieving Hot Spots](/en/ch7#sec_sharding_skew)
- hedged requests, [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- heterogeneous distributed transactions, [Distributed Transactions Across Different Systems](/en/ch8#sec_transactions_xa), [Problems with XA transactions](/en/ch8#problems-with-xa-transactions)
- heuristic decisions (in 2PC), [Recovering from coordinator failure](/en/ch8#recovering-from-coordinator-failure)
- Hex (notebook), [Machine Learning](/en/ch11#id290)
- hexagons
- for geospatial indexing, [Multidimensional and Full-Text Indexes](/en/ch4#sec_storage_multidimensional)
- Hibernate (object-relational mapper), [Object-relational mapping (ORM)](/en/ch3#object-relational-mapping-orm)
- hierarchical model, [Relational Model versus Document Model](/en/ch3#sec_datamodels_history)
- hierarchical navigable small world (vector index), [Vector Embeddings](/en/ch4#id92)
- hierarchical queries (see recursive common table expressions)
- high availability (see fault tolerance)
- high-frequency trading, [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy)
- high-performance computing (HPC), [Cloud Computing Versus Supercomputing](/en/ch1#id17)
- hinted handoff (leaderless replication), [Catching up on missed writes](/en/ch6#sec_replication_read_repair)
- histograms, [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla)
- Hive (data warehouse), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- query optimizer, [Query languages](/en/ch11#sec_batch_query_lanauges)
- HNSW (vector index), [Vector Embeddings](/en/ch4#id92)
- hopping windows (stream processing), [Types of windows](/en/ch12#id324)
- (see also windows)
- Hoptimator (query engine), [The meta-database of everything](/en/ch13#id341)
- Horizon scandal, [Humans and Reliability](/en/ch2#id31)
- lack of transactions, [Transactions](/en/ch8#ch_transactions)
- horizontal scaling (see scaling out)
- by sharding, [Pros and Cons of Sharding](/en/ch7#sec_sharding_reasons)
- HornetQ (messaging), [Message brokers](/en/ch5#message-brokers), [Message brokers compared to databases](/en/ch12#id297)
- distributed transaction support, [XA transactions](/en/ch8#xa-transactions)
- hot keys, [Sharding of Key-Value Data](/en/ch7#sec_sharding_key_value)
- hot spots, [Sharding of Key-Value Data](/en/ch7#sec_sharding_key_value)
- due to celebrities, [Skewed Workloads and Relieving Hot Spots](/en/ch7#sec_sharding_skew)
- for time-series data, [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- relieving, [Skewed Workloads and Relieving Hot Spots](/en/ch7#sec_sharding_skew)
- hot standbys (see leader-based replication)
- HTAP (see hybrid transactional/analytic processing)
- HTTP, use in APIs (see services)
- human errors, [Humans and Reliability](/en/ch2#id31), [Network Faults in Practice](/en/ch9#sec_distributed_network_faults), [Batch Processing](/en/ch11#ch_batch)
- hybrid logical clocks, [Hybrid logical clocks](/en/ch10#hybrid-logical-clocks)
- hybrid transactional/analytic processing, [Data Warehousing](/en/ch1#sec_introduction_dwh), [Data Storage for Analytics](/en/ch4#sec_storage_analytics)
- hydrating IDs (join), [Denormalization in the social networking case study](/en/ch3#denormalization-in-the-social-networking-case-study)
- hypergraph, [Property Graphs](/en/ch3#id56)
- HyperLogLog (algorithm), [Stream analytics](/en/ch12#id318)
### I
- I/O operations, waiting for, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- IaaS (see infrastructure as a service (IaaS))
- IBM
- Db2 (database)
- distributed transaction support, [XA transactions](/en/ch8#xa-transactions)
- serializable isolation, [Snapshot isolation, repeatable read, and naming confusion](/en/ch8#snapshot-isolation-repeatable-read-and-naming-confusion), [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- MQ (messaging), [Message brokers compared to databases](/en/ch12#id297)
- distributed transaction support, [XA transactions](/en/ch8#xa-transactions)
- System R (database), [What Exactly Is a Transaction?](/en/ch8#sec_transactions_overview)
- WebSphere (messaging), [Message brokers](/en/ch5#message-brokers)
- Iceberg (table format), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- databases on object storage, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- log-based message broker storage, [Disk space usage](/en/ch12#sec_stream_disk_usage)
- idempotence, [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc), [Idempotence](/en/ch12#sec_stream_idempotence), [Glossary](/en/glossary)
- by giving operations unique IDs, [Multi-shard request processing](/en/ch13#id360)
- by giving requests unique IDs, [Uniquely identifying requests](/en/ch13#id355)
- for exactly-once semantics, [Exactly-once message processing revisited](/en/ch8#exactly-once-message-processing-revisited)
- idempotent operations, [Exactly-once execution of an operation](/en/ch13#id353)
- in workflow engines, [Durable execution](/en/ch5#durable-execution)
- immutability
- advantages of, [Advantages of immutable events](/en/ch12#sec_stream_immutability_pros), [Designing for auditability](/en/ch13#id365)
- and right to erasure, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance), [Disk space usage](/en/ch4#disk-space-usage)
- crypto-shredding for deletion, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events), [Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- deriving state from event log, [State, Streams, and Immutability](/en/ch12#sec_stream_immutability)-[Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- for crash recovery, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- in B-trees, [B-tree variants](/en/ch4#b-tree-variants), [Indexes and snapshot isolation](/en/ch8#indexes-and-snapshot-isolation)
- in event sourcing, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events), [Change data capture versus event sourcing](/en/ch12#sec_stream_event_sourcing)
- limitations of, [Concurrency control](/en/ch12#sec_stream_concurrency)
- impedance mismatch, [The Object-Relational Mismatch](/en/ch3#sec_datamodels_document)
- in doubt (transaction status), [Coordinator failure](/en/ch8#coordinator-failure)
- holding locks, [Holding locks while in doubt](/en/ch8#holding-locks-while-in-doubt)
- orphaned transactions, [Recovering from coordinator failure](/en/ch8#recovering-from-coordinator-failure)
- in-memory databases, [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- durability, [Durability](/en/ch8#durability)
- serial transaction execution, [Actual Serial Execution](/en/ch8#sec_transactions_serial)
- incidents
- accounting software bugs leading to wrongful convictions, [Humans and Reliability](/en/ch2#id31)
- blameless postmortems, [Humans and Reliability](/en/ch2#id31)
- crashes due to leap seconds, [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy)
- data corruption and financial losses due to concurrency bugs, [Weak Isolation Levels](/en/ch8#sec_transactions_isolation_levels)
- data corruption on hard disks, [Durability](/en/ch8#durability)
- data loss due to last-write-wins, [Timestamps for ordering events](/en/ch9#sec_distributed_lww)
- data on disks unreadable, [Mapping system models to the real world](/en/ch9#mapping-system-models-to-the-real-world)
- disclosure of sensitive data due to primary key reuse, [Leader failure: Failover](/en/ch6#leader-failure-failover)
- errors in transaction serializability, [Maintaining integrity in the face of software bugs](/en/ch13#id455)
- gigabit network interface with 1 Kb/s throughput, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- leap second crash, [Software faults](/en/ch2#software-faults)
- network faults, [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)
- network interface dropping only inbound packets, [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)
- network partitions and whole-datacenter failures, [Faults and Partial Failures](/en/ch9#sec_distributed_partial_failure)
- poor handling of network faults, [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)
- sending message to ex-partner, [Ordering events to capture causality](/en/ch13#sec_future_capture_causality)
- sharks biting undersea cables, [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)
- split brain due to 1-minute packet delay, [Leader failure: Failover](/en/ch6#leader-failure-failover), [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)
- SSD failure after 32,768 hours, [Software faults](/en/ch2#software-faults)
- thread contention bringing down a service, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- vibrations in server rack, [Latency and Response Time](/en/ch2#id23)
- violation of uniqueness constraint, [Maintaining integrity in the face of software bugs](/en/ch13#id455)
- incremental view maintenance (IVM), [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- for data integration, [Unbundled versus integrated systems](/en/ch13#id448)
- indexes, [Storage and Indexing for OLTP](/en/ch4#sec_storage_oltp), [Glossary](/en/glossary)
- and snapshot isolation, [Indexes and snapshot isolation](/en/ch8#indexes-and-snapshot-isolation)
- as derived data, [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived), [Composing Data Storage Technologies](/en/ch13#id447)-[Unbundled versus integrated systems](/en/ch13#id448)
- B-trees, [B-Trees](/en/ch4#sec_storage_b_trees)-[B-tree variants](/en/ch4#b-tree-variants)
- clustered, [Storing values within the index](/en/ch4#sec_storage_index_heap)
- comparison of B-trees and LSM-trees, [Comparing B-Trees and LSM-Trees](/en/ch4#sec_storage_btree_lsm_comparison)-[Disk space usage](/en/ch4#disk-space-usage)
- covering (with included columns), [Storing values within the index](/en/ch4#sec_storage_index_heap)
- creating, [Creating an index](/en/ch13#id340)
- full-text search, [Full-Text Search](/en/ch4#sec_storage_full_text)
- geospatial, [Multidimensional and Full-Text Indexes](/en/ch4#sec_storage_multidimensional)
- index-range locking, [Index-range locks](/en/ch8#sec_transactions_2pl_range)
- multi-column (concatenated), [Multidimensional and Full-Text Indexes](/en/ch4#sec_storage_multidimensional)
- secondary, [Multi-Column and Secondary Indexes](/en/ch4#sec_storage_index_multicolumn)
- (see also secondary indexes)
- problems with dual writes, [Keeping Systems in Sync](/en/ch12#sec_stream_sync), [Reasoning about dataflows](/en/ch13#id443)
- sharding and secondary indexes, [Sharding and Secondary Indexes](/en/ch7#sec_sharding_secondary_indexes)-[Global Secondary Indexes](/en/ch7#id167), [Summary](/en/ch7#summary)
- sparse, [The SSTable file format](/en/ch4#the-sstable-file-format)
- SSTables and LSM-trees, [The SSTable file format](/en/ch4#the-sstable-file-format)-[Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- updating when data changes, [Keeping Systems in Sync](/en/ch12#sec_stream_sync), [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- Industrial Revolution, [Remembering the Industrial Revolution](/en/ch14#id377)
- InfiniBand (networks), [Can we not simply make network delays predictable?](/en/ch9#can-we-not-simply-make-network-delays-predictable)
- InfluxDB IOx (storage engine), [Column-Oriented Storage](/en/ch4#sec_storage_column)
- information retrieval (see full-text search)
- infrastructure as a service (IaaS), [Cloud Versus Self-Hosting](/en/ch1#sec_introduction_cloud), [Layering of cloud services](/en/ch1#layering-of-cloud-services)
- InnoDB (storage engine)
- clustered index on primary key, [Storing values within the index](/en/ch4#sec_storage_index_heap)
- not preventing lost updates, [Automatically detecting lost updates](/en/ch8#automatically-detecting-lost-updates)
- preventing write skew, [Characterizing write skew](/en/ch8#characterizing-write-skew), [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- serializable isolation, [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- snapshot isolation support, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)
- instance (cloud computing), [Layering of cloud services](/en/ch1#layering-of-cloud-services)
- integrating different data systems (see data integration)
- integrity, [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- coordination-avoiding data systems, [Coordination-avoiding data systems](/en/ch13#id454)
- correctness of dataflow systems, [Correctness of dataflow systems](/en/ch13#id453)
- in consensus formalization, [Single-value consensus](/en/ch10#single-value-consensus), [Atomic commitment as consensus](/en/ch10#atomic-commitment-as-consensus)
- integrity checks, [Don't just blindly trust what they promise](/en/ch13#id364)
- (see also auditing)
- end-to-end, [The end-to-end argument](/en/ch13#sec_future_e2e_argument), [The end-to-end argument again](/en/ch13#id456)
- use of snapshot isolation, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)
- maintaining despite software bugs, [Maintaining integrity in the face of software bugs](/en/ch13#id455)
- Interface Definition Language (IDL), [Protocol Buffers](/en/ch5#sec_encoding_protobuf), [Avro](/en/ch5#sec_encoding_avro), [Web services](/en/ch5#sec_web_services)
- invariants, [Consistency](/en/ch8#sec_transactions_acid_consistency)
- (see also constraints)
- inverted file index (vector index), [Vector Embeddings](/en/ch4#id92)
- inverted index, [Full-Text Search](/en/ch4#sec_storage_full_text)
- irreversibility, minimizing, [Evolvability: Making Change Easy](/en/ch2#sec_introduction_evolvability), [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events), [Batch Processing](/en/ch11#ch_batch)
- ISDN (Integrated Services Digital Network), [Synchronous Versus Asynchronous Networks](/en/ch9#sec_distributed_sync_networks)
- isolation (in operating systems)
- cgroups (see cgroups)
- isolation (in transactions), [Isolation](/en/ch8#sec_transactions_acid_isolation), [Single-Object and Multi-Object Operations](/en/ch8#sec_transactions_multi_object), [Glossary](/en/glossary)
- correctness and, [Aiming for Correctness](/en/ch13#sec_future_correctness)
- for single-object writes, [Single-object writes](/en/ch8#sec_transactions_single_object)
- serializability, [Serializability](/en/ch8#sec_transactions_serializability)-[Performance of serializable snapshot isolation](/en/ch8#performance-of-serializable-snapshot-isolation)
- actual serial execution, [Actual Serial Execution](/en/ch8#sec_transactions_serial)-[Summary of serial execution](/en/ch8#summary-of-serial-execution)
- serializable snapshot isolation (SSI), [Serializable Snapshot Isolation (SSI)](/en/ch8#sec_transactions_ssi)-[Performance of serializable snapshot isolation](/en/ch8#performance-of-serializable-snapshot-isolation)
- two-phase locking (2PL), [Two-Phase Locking (2PL)](/en/ch8#sec_transactions_2pl)-[Index-range locks](/en/ch8#sec_transactions_2pl_range)
- violating, [Single-Object and Multi-Object Operations](/en/ch8#sec_transactions_multi_object)
- weak isolation levels, [Weak Isolation Levels](/en/ch8#sec_transactions_isolation_levels)-[Materializing conflicts](/en/ch8#materializing-conflicts)
- preventing lost updates, [Preventing Lost Updates](/en/ch8#sec_transactions_lost_update)-[Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- read committed, [Read Committed](/en/ch8#sec_transactions_read_committed)-[Implementing read committed](/en/ch8#sec_transactions_read_committed_impl)
- snapshot isolation, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)-[Snapshot isolation, repeatable read, and naming confusion](/en/ch8#snapshot-isolation-repeatable-read-and-naming-confusion)
- IVF (vector index), [Vector Embeddings](/en/ch4#id92)
### J
- Java Database Connectivity (JDBC)
- distributed transaction support, [XA transactions](/en/ch8#xa-transactions)
- network drivers, [The Merits of Schemas](/en/ch5#sec_encoding_schemas)
- Java Enterprise Edition (EE), [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc), [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc), [XA transactions](/en/ch8#xa-transactions)
- Java Message Service (JMS), [Message brokers compared to databases](/en/ch12#id297)
- (see also messaging systems)
- comparison to log-based messaging, [Logs compared to traditional messaging](/en/ch12#sec_stream_logs_vs_messaging), [Replaying old messages](/en/ch12#sec_stream_replay)
- distributed transaction support, [XA transactions](/en/ch8#xa-transactions)
- message ordering, [Acknowledgments and redelivery](/en/ch12#sec_stream_reordering)
- Java Transaction API (JTA), [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc), [XA transactions](/en/ch8#xa-transactions)
- Java Virtual Machine (JVM)
- garbage collection, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses), [Limiting the impact of garbage collection](/en/ch9#sec_distributed_gc_impact)
- JIT compilation, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- process reuse in batch processors, [Dataflow Engines](/en/ch11#sec_batch_dataflow)
- Jena (RDF framework), [The RDF data model](/en/ch3#the-rdf-data-model)
- SPARQL query language, [The SPARQL query language](/en/ch3#the-sparql-query-language)
- Jepsen (fault tolerance testing), [Fault injection](/en/ch9#sec_fault_injection), [Aiming for Correctness](/en/ch13#sec_future_correctness)
- jitter (network delay), [Average, Median, and Percentiles](/en/ch2#id24), [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- JMESPath (query language), [Query languages](/en/ch11#sec_batch_query_lanauges)
- join table, [Many-to-One and Many-to-Many Relationships](/en/ch3#sec_datamodels_many_to_many), [Property Graphs](/en/ch3#id56)
- joins, [Glossary](/en/glossary)
- expressing as relational operators, [Query languages](/en/ch11#sec_batch_query_lanauges)
- handling GraphQL query, [GraphQL](/en/ch3#id63)
- in application code, [Normalization, Denormalization, and Joins](/en/ch3#sec_datamodels_normalization), [Denormalization in the social networking case study](/en/ch3#denormalization-in-the-social-networking-case-study)
- in DataFrames, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- in relational and document databases, [Normalization, Denormalization, and Joins](/en/ch3#sec_datamodels_normalization)
- secondary indexes and, [Multi-Column and Secondary Indexes](/en/ch4#sec_storage_index_multicolumn)
- sort-merge joins, [JOIN and GROUP BY](/en/ch11#sec_batch_join)
- stream joins, [Stream Joins](/en/ch12#sec_stream_joins)-[Time-dependence of joins](/en/ch12#sec_stream_join_time)
- stream-stream join, [Stream-stream join (window join)](/en/ch12#id440)
- stream-table join, [Stream-table join (stream enrichment)](/en/ch12#sec_stream_table_joins)
- table-table join, [Table-table join (materialized view maintenance)](/en/ch12#id326)
- time-dependence of, [Time-dependence of joins](/en/ch12#sec_stream_join_time)
- support in document databases, [Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- JOTM (transaction coordinator), [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc)
- journaling (filesystems), [Making B-trees reliable](/en/ch4#sec_storage_btree_wal)
- JSON
- aggregation pipeline (query language), [Query languages for documents](/en/ch3#query-languages-for-documents)
- Avro schema representation, [Avro](/en/ch5#sec_encoding_avro)
- binary variants, [Binary encoding](/en/ch5#binary-encoding)
- data locality, [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality)
- document data model, [Relational Model versus Document Model](/en/ch3#sec_datamodels_history)
- for application data, issues with, [JSON, XML, and Binary Variants](/en/ch5#sec_encoding_json)
- GraphQL response, [GraphQL](/en/ch3#id63)
- in relational databases, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility)
- representing a résumé (example), [The document data model for one-to-many relationships](/en/ch3#the-document-data-model-for-one-to-many-relationships)
- Schema, [JSON Schema](/en/ch5#json-schema)
- JSON-LD, [Triple-Stores and SPARQL](/en/ch3#id59)
- JsonPath (query language), [Query languages](/en/ch11#sec_batch_query_lanauges)
- JuiceFS (distributed filesystem), [Distributed Filesystems](/en/ch11#sec_batch_dfs), [Object Stores](/en/ch11#id277)
- Jupyter (notebook), [Machine Learning](/en/ch11#id290)
- just-in-time (JIT) compilation, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
### K
- Kafka (messaging), [Message brokers](/en/ch5#message-brokers), [Using logs for message storage](/en/ch12#id300)
- consumer groups, [Multiple consumers](/en/ch12#id298)
- for data integration, [Unbundled versus integrated systems](/en/ch13#id448)
- for event sourcing, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- Kafka Connect (database integration), [Implementing change data capture](/en/ch12#id307), [API support for change streams](/en/ch12#sec_stream_change_api), [Deriving several views from the same event log](/en/ch12#sec_stream_deriving_views)
- Kafka Streams (stream processor), [Stream analytics](/en/ch12#id318), [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- exactly-once semantics, [Exactly-once message processing revisited](/en/ch8#exactly-once-message-processing-revisited)
- fault tolerance, [Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- ksqlDB (stream database), [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- leader-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- log compaction, [Log compaction](/en/ch12#sec_stream_log_compaction), [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- message offsets, [Using logs for message storage](/en/ch12#id300), [Idempotence](/en/ch12#sec_stream_idempotence)
- partitions (sharding), [Sharding](/en/ch7#ch_sharding)
- request routing, [Request Routing](/en/ch7#sec_sharding_routing)
- schema registry, [But what is the writer's schema?](/en/ch5#but-what-is-the-writers-schema)
- serving derived data, [Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- tiered storage, [Disk space usage](/en/ch12#sec_stream_disk_usage)
- transactions, [Database-internal Distributed Transactions](/en/ch8#sec_transactions_internal), [Atomic commit revisited](/en/ch12#sec_stream_atomic_commit)
- unclean leader election, [Subtleties of consensus](/en/ch10#subtleties-of-consensus)
- use of model-checking, [Model checking and specification languages](/en/ch9#model-checking-and-specification-languages)
- kappa architecture, [Unifying batch and stream processing](/en/ch13#id338)
- key-value stores, [Storage and Indexing for OLTP](/en/ch4#sec_storage_oltp)
- comparison to object stores, [Object Stores](/en/ch11#id277)
- in-memory, [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- LSM storage, [Log-Structured Storage](/en/ch4#sec_storage_log_structured)-[Disk space usage](/en/ch4#disk-space-usage)
- sharding, [Sharding of Key-Value Data](/en/ch7#sec_sharding_key_value)-[Skewed Workloads and Relieving Hot Spots](/en/ch7#sec_sharding_skew)
- by hash of key, [Sharding by Hash of Key](/en/ch7#sec_sharding_hash), [Summary](/en/ch7#summary)
- by key range, [Sharding by Key Range](/en/ch7#sec_sharding_key_range), [Summary](/en/ch7#summary)
- skew and hot spots, [Skewed Workloads and Relieving Hot Spots](/en/ch7#sec_sharding_skew)
- Kinesis (messaging), [Message brokers](/en/ch5#message-brokers), [Using logs for message storage](/en/ch12#id300)
- data warehouse integration, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- Kryo (Java), [Language-Specific Formats](/en/ch5#id96)
- ksqlDB (stream database), [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- Kubernetes (cluster manager), [Cloud Versus Self-Hosting](/en/ch1#sec_introduction_cloud), [Microservices and Serverless](/en/ch1#sec_introduction_microservices), [Distributed Job Orchestration](/en/ch11#id278), [Separation of application code and state](/en/ch13#id344)
- Kubeflow, [Machine Learning](/en/ch11#id290)
- kubelet, [Distributed Job Orchestration](/en/ch11#id278)
- operators, [Distributed Job Orchestration](/en/ch11#id278)
- use of etcd, [Request Routing](/en/ch7#sec_sharding_routing), [Coordination Services](/en/ch10#sec_consistency_coordination)
- KùzuDB (database), [Problems with Distributed Systems](/en/ch1#sec_introduction_dist_sys_problems), [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- as embedded storage engine, [Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- Cypher query language, [The Cypher Query Language](/en/ch3#id57)
### L
- labeled property graphs (see property graphs)
- lambda architecture, [Unifying batch and stream processing](/en/ch13#id338)
- Lamport timestamps, [Lamport timestamps](/en/ch10#lamport-timestamps)
- Lance (data format), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses), [Column-Oriented Storage](/en/ch4#sec_storage_column)
- (see also column-oriented storage)
- large language models (LLMs)
- pre-processing training data, [Machine Learning](/en/ch11#id290)
- last write wins (LWW), [Last write wins (discarding concurrent writes)](/en/ch6#sec_replication_lww), [Detecting Concurrent Writes](/en/ch6#sec_replication_concurrent), [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)
- problems with, [Timestamps for ordering events](/en/ch9#sec_distributed_lww)
- prone to lost updates, [Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- latency, [Latency and Response Time](/en/ch2#id23)
- (see also response time)
- across regions, [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed)
- instability under two-phase locking, [Performance of two-phase locking](/en/ch8#performance-of-two-phase-locking)
- network latency and resource utilization, [Can we not simply make network delays predictable?](/en/ch9#can-we-not-simply-make-network-delays-predictable)
- reducing by request hedging, [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- response time versus, [Latency and Response Time](/en/ch2#id23)
- tail latency, [Average, Median, and Percentiles](/en/ch2#id24), [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla), [Local Secondary Indexes](/en/ch7#id166)
- law (see legal matters)
- layering (of cloud services), [Layering of cloud services](/en/ch1#layering-of-cloud-services)
- leader-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)-[Logical (row-based) log replication](/en/ch6#logical-row-based-log-replication)
- (see also replication)
- failover, [Leader failure: Failover](/en/ch6#leader-failure-failover), [Distributed Locks and Leases](/en/ch9#sec_distributed_lock_fencing)
- handling node outages, [Handling Node Outages](/en/ch6#sec_replication_failover)
- implementation of replication logs
- change data capture, [Change Data Capture](/en/ch12#sec_stream_cdc)-[API support for change streams](/en/ch12#sec_stream_change_api)
- (see also changelogs)
- statement-based, [Statement-based replication](/en/ch6#statement-based-replication)
- write-ahead log (WAL) shipping, [Write-ahead log (WAL) shipping](/en/ch6#write-ahead-log-wal-shipping)
- linearizability of operations, [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)
- locking and leader election, [Locking and leader election](/en/ch10#locking-and-leader-election)
- log sequence number, [Setting Up New Followers](/en/ch6#sec_replication_new_replica), [Consumer offsets](/en/ch12#sec_stream_log_offsets)
- read-scaling architecture, [Problems with Replication Lag](/en/ch6#sec_replication_lag), [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- relation to consensus, [Consensus](/en/ch10#sec_consistency_consensus), [From single-leader replication to consensus](/en/ch10#from-single-leader-replication-to-consensus), [Pros and cons of consensus](/en/ch10#pros-and-cons-of-consensus)
- setting up new followers, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- synchronous versus asynchronous, [Synchronous Versus Asynchronous Replication](/en/ch6#sec_replication_sync_async)
- leaderless replication, [Leaderless Replication](/en/ch6#sec_replication_leaderless)-[Version vectors](/en/ch6#version-vectors)
- (see also replication)
- catching up on missed writes, [Catching up on missed writes](/en/ch6#sec_replication_read_repair)
- detecting concurrent writes, [Detecting Concurrent Writes](/en/ch6#sec_replication_concurrent)-[Version vectors](/en/ch6#version-vectors)
- version vectors, [Version vectors](/en/ch6#version-vectors)
- multi-region, [Multi-region operation](/en/ch6#multi-region-operation)
- quorums, [Quorums for reading and writing](/en/ch6#sec_replication_quorum_condition)-[Multi-region operation](/en/ch6#multi-region-operation)
- consistency limitations, [Limitations of Quorum Consistency](/en/ch6#sec_replication_quorum_limitations)-[Monitoring staleness](/en/ch6#monitoring-staleness), [Linearizability and quorums](/en/ch10#sec_consistency_quorum_linearizable)
- leap seconds, [Software faults](/en/ch2#software-faults), [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy)
- in time-of-day clocks, [Time-of-day clocks](/en/ch9#time-of-day-clocks)
- leases, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- implementation with coordination service, [Coordination Services](/en/ch10#sec_consistency_coordination)
- need for fencing, [Distributed Locks and Leases](/en/ch9#sec_distributed_lock_fencing)
- relation to consensus, [Single-value consensus](/en/ch10#single-value-consensus)
- ledgers (accounting), [Summary](/en/ch3#summary)
- immutability, [Advantages of immutable events](/en/ch12#sec_stream_immutability_pros)
- legacy systems, maintenance of, [Maintainability](/en/ch2#sec_introduction_maintainability)
- legal matters, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance)
- data deletion, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance), [Disk space usage](/en/ch4#disk-space-usage)
- data residency, [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed), [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- privacy regulation, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance), [Legislation and Self-Regulation](/en/ch14#sec_future_legislation)
- legitimate interest (GDPR), [Consent and Freedom of Choice](/en/ch14#id375)
- leveled compaction, [Compaction strategies](/en/ch4#sec_storage_lsm_compaction), [Disk space usage](/en/ch4#disk-space-usage)
- Levenshtein automata, [Full-Text Search](/en/ch4#sec_storage_full_text)
- limping (partial failure), [System Model and Reality](/en/ch9#sec_distributed_system_model)
- Linear (project management software), [Real-time collaboration, offline-first, and local-first apps](/en/ch6#real-time-collaboration-offline-first-and-local-first-apps)
- linear algebra, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- linear scalability, [Describing Load](/en/ch2#id33)
- linearizability, [Solutions for Replication Lag](/en/ch6#id131), [Linearizability](/en/ch10#sec_consistency_linearizability)-[Linearizability and network delays](/en/ch10#linearizability-and-network-delays), [Glossary](/en/glossary)
- and consensus, [Consensus](/en/ch10#sec_consistency_consensus)
- cost of, [The Cost of Linearizability](/en/ch10#sec_linearizability_cost)-[Linearizability and network delays](/en/ch10#linearizability-and-network-delays)
- CAP theorem, [The CAP theorem](/en/ch10#the-cap-theorem)
- memory on multi-core CPUs, [Linearizability and network delays](/en/ch10#linearizability-and-network-delays)
- definition, [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- ID generation, [Linearizable ID Generators](/en/ch10#sec_consistency_linearizable_id)
- in coordination services, [Coordination Services](/en/ch10#sec_consistency_coordination)
- of derived data systems
- avoiding coordination, [Coordination-avoiding data systems](/en/ch13#id454)
- of different replication methods, [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)-[Linearizability and quorums](/en/ch10#sec_consistency_quorum_linearizable)
- using quorums, [Linearizability and quorums](/en/ch10#sec_consistency_quorum_linearizable)
- reads in consensus systems, [Subtleties of consensus](/en/ch10#subtleties-of-consensus)
- relying on, [Relying on Linearizability](/en/ch10#sec_consistency_linearizability_usage)-[Cross-channel timing dependencies](/en/ch10#cross-channel-timing-dependencies)
- constraints and uniqueness, [Constraints and uniqueness guarantees](/en/ch10#sec_consistency_uniqueness)
- cross-channel timing dependencies, [Cross-channel timing dependencies](/en/ch10#cross-channel-timing-dependencies)
- locking and leader election, [Locking and leader election](/en/ch10#locking-and-leader-election)
- versus serializability, [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- linked data, [Triple-Stores and SPARQL](/en/ch3#id59)
- LinkedIn
- Espresso (database), [But what is the writer's schema?](/en/ch5#but-what-is-the-writers-schema)
- LIquid (database), [Datalog: Recursive Relational Queries](/en/ch3#id62)
- profile (example), [The document data model for one-to-many relationships](/en/ch3#the-document-data-model-for-one-to-many-relationships)
- Linux, leap second bug, [Software faults](/en/ch2#software-faults), [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy)
- Litestream (backup tool), [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- liveness properties, [Safety and liveness](/en/ch9#sec_distributed_safety_liveness)
- LLVM (compiler), [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- LMDB (storage engine), [Compaction strategies](/en/ch4#sec_storage_lsm_compaction), [B-tree variants](/en/ch4#b-tree-variants), [Indexes and snapshot isolation](/en/ch8#indexes-and-snapshot-isolation)
- load
- coping with, [Principles for Scalability](/en/ch2#id35)
- describing, [Describing Load](/en/ch2#id33)
- load balancing, [Describing Performance](/en/ch2#sec_introduction_percentiles), [Load balancers, service discovery, and service meshes](/en/ch5#sec_encoding_service_discovery)
- in hardware, [Load balancers, service discovery, and service meshes](/en/ch5#sec_encoding_service_discovery)
- in software, [Load balancers, service discovery, and service meshes](/en/ch5#sec_encoding_service_discovery)
- using message brokers, [Multiple consumers](/en/ch12#id298)
- load shedding, [Describing Performance](/en/ch2#sec_introduction_percentiles)
- local secondary indexes, [Local Secondary Indexes](/en/ch7#id166), [Summary](/en/ch7#summary)
- local-first software, [Real-time collaboration, offline-first, and local-first apps](/en/ch6#real-time-collaboration-offline-first-and-local-first-apps)
- locality (data access), [The document data model for one-to-many relationships](/en/ch3#the-document-data-model-for-one-to-many-relationships), [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality), [Glossary](/en/glossary)
- in batch processing, [Dataflow Engines](/en/ch11#sec_batch_dataflow)
- in stateful clients, [Sync Engines and Local-First Software](/en/ch6#sec_replication_offline_clients), [Stateful, offline-capable clients](/en/ch13#id347)
- in stream processing, [Stream-table join (stream enrichment)](/en/ch12#sec_stream_table_joins), [Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance), [Stream processors and services](/en/ch13#id345), [Uniqueness in log-based messaging](/en/ch13#sec_future_uniqueness_log)
- location transparency, [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc)
- in the actor model, [Distributed actor frameworks](/en/ch5#distributed-actor-frameworks)
- lock-in, [Pros and Cons of Cloud Services](/en/ch1#sec_introduction_cloud_tradeoffs)
- locks, [Glossary](/en/glossary)
- deadlock, [Explicit locking](/en/ch8#explicit-locking), [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- distributed locking, [Distributed Locks and Leases](/en/ch9#sec_distributed_lock_fencing)-[Fencing with multiple replicas](/en/ch9#fencing-with-multiple-replicas), [Locking and leader election](/en/ch10#locking-and-leader-election)
- fencing tokens, [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens)
- implementation with coordination service, [Coordination Services](/en/ch10#sec_consistency_coordination)
- relation to consensus, [Single-value consensus](/en/ch10#single-value-consensus)
- for transaction isolation
- in snapshot isolation, [Multi-version concurrency control (MVCC)](/en/ch8#sec_transactions_snapshot_impl)
- in two-phase locking (2PL), [Two-Phase Locking (2PL)](/en/ch8#sec_transactions_2pl)-[Index-range locks](/en/ch8#sec_transactions_2pl_range)
- making operations atomic, [Atomic write operations](/en/ch8#atomic-write-operations)
- performance, [Performance of two-phase locking](/en/ch8#performance-of-two-phase-locking)
- preventing dirty writes, [Implementing read committed](/en/ch8#sec_transactions_read_committed_impl)
- preventing phantoms with index-range locks, [Index-range locks](/en/ch8#sec_transactions_2pl_range), [Detecting writes that affect prior reads](/en/ch8#sec_detecting_writes_affect_reads)
- read locks (shared mode), [Implementing read committed](/en/ch8#sec_transactions_read_committed_impl), [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- shared mode and exclusive mode, [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- in distributed transactions
- deadlock detection, [Problems with XA transactions](/en/ch8#problems-with-xa-transactions)
- in-doubt transactions holding locks, [Holding locks while in doubt](/en/ch8#holding-locks-while-in-doubt)
- materializing conflicts with, [Materializing conflicts](/en/ch8#materializing-conflicts)
- preventing lost updates by explicit locking, [Explicit locking](/en/ch8#explicit-locking)
- log sequence number, [Setting Up New Followers](/en/ch6#sec_replication_new_replica), [Consumer offsets](/en/ch12#sec_stream_log_offsets)
- logical clocks, [Timestamps for ordering events](/en/ch9#sec_distributed_lww), [ID Generators and Logical Clocks](/en/ch10#sec_consistency_logical)-[Enforcing constraints using logical clocks](/en/ch10#enforcing-constraints-using-logical-clocks), [Ordering events to capture causality](/en/ch13#sec_future_capture_causality)
- for last-write-wins, [Last write wins (discarding concurrent writes)](/en/ch6#sec_replication_lww)
- for read-after-write consistency, [Reading Your Own Writes](/en/ch6#sec_replication_ryw)
- hybrid logical clocks, [Hybrid logical clocks](/en/ch10#hybrid-logical-clocks)
- insufficiency for enforcing constraints, [Enforcing constraints using logical clocks](/en/ch10#enforcing-constraints-using-logical-clocks)
- Lamport timestamps, [Lamport timestamps](/en/ch10#lamport-timestamps)
- logical replication, [Logical (row-based) log replication](/en/ch6#logical-row-based-log-replication)
- for change data capture, [Implementing change data capture](/en/ch12#id307)
- LogicBlox (database), [Datalog: Recursive Relational Queries](/en/ch3#id62)
- logs (data structure), [Storage and Indexing for OLTP](/en/ch4#sec_storage_oltp), [Shared logs as consensus](/en/ch10#sec_consistency_shared_logs), [Glossary](/en/glossary)
- (see also shared logs)
- advantages of immutability, [Advantages of immutable events](/en/ch12#sec_stream_immutability_pros)
- and right to erasure, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance), [Disk space usage](/en/ch4#disk-space-usage)
- compaction, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables), [Compaction strategies](/en/ch4#sec_storage_lsm_compaction), [Log compaction](/en/ch12#sec_stream_log_compaction), [State, Streams, and Immutability](/en/ch12#sec_stream_immutability)
- for stream operator state, [Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- implementing uniqueness constraints, [Uniqueness in log-based messaging](/en/ch13#sec_future_uniqueness_log)
- log-based messaging, [Log-based Message Brokers](/en/ch12#sec_stream_log)-[Replaying old messages](/en/ch12#sec_stream_replay)
- comparison to traditional messaging, [Logs compared to traditional messaging](/en/ch12#sec_stream_logs_vs_messaging), [Replaying old messages](/en/ch12#sec_stream_replay)
- consumer offsets, [Consumer offsets](/en/ch12#sec_stream_log_offsets)
- disk space usage, [Disk space usage](/en/ch12#sec_stream_disk_usage)
- replaying old messages, [Replaying old messages](/en/ch12#sec_stream_replay), [Reprocessing data for application evolution](/en/ch13#sec_future_reprocessing), [Unifying batch and stream processing](/en/ch13#id338)
- slow consumers, [When consumers cannot keep up with producers](/en/ch12#id459)
- using logs for message storage, [Using logs for message storage](/en/ch12#id300)
- log-structured storage, [Storage and Indexing for OLTP](/en/ch4#sec_storage_oltp)-[Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- log-structured merge tree (see LSM-trees)
- relation to consensus, [Shared logs as consensus](/en/ch10#sec_consistency_shared_logs)
- replication, [Single-Leader Replication](/en/ch6#sec_replication_leader), [Implementation of Replication Logs](/en/ch6#sec_replication_implementation)-[Logical (row-based) log replication](/en/ch6#logical-row-based-log-replication)
- change data capture, [Change Data Capture](/en/ch12#sec_stream_cdc)-[API support for change streams](/en/ch12#sec_stream_change_api)
- (see also changelogs)
- coordination with snapshot, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- logical (row-based) replication, [Logical (row-based) log replication](/en/ch6#logical-row-based-log-replication)
- statement-based replication, [Statement-based replication](/en/ch6#statement-based-replication)
- write-ahead log (WAL) shipping, [Write-ahead log (WAL) shipping](/en/ch6#write-ahead-log-wal-shipping)
- scalability limits, [The limits of total ordering](/en/ch13#id335)
- Looker (business intelligence software), [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp), [Analytics](/en/ch11#sec_batch_olap)
- loose coupling, [Making unbundling work](/en/ch13#sec_future_unbundling_favor)
- lost updates (see updates)
- Lotus Notes (sync engine), [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- LSM-trees (indexes), [The SSTable file format](/en/ch4#the-sstable-file-format)-[Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- comparison to B-trees, [Comparing B-Trees and LSM-Trees](/en/ch4#sec_storage_btree_lsm_comparison)-[Disk space usage](/en/ch4#disk-space-usage)
- Lucene (storage engine), [Full-Text Search](/en/ch4#sec_storage_full_text)
- similarity search, [Full-Text Search](/en/ch4#sec_storage_full_text)
- LWW (see last write wins)
### M
- machine learning
- batch inference, [Machine Learning](/en/ch11#id290)
- data preparation with DataFrames, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- deleting training data, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance)
- deploying data products, [Beyond the data lake](/en/ch1#beyond-the-data-lake)
- ethical considerations, [Predictive Analytics](/en/ch14#id369)
- (see also ethics)
- feature engineering, [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake), [Machine Learning](/en/ch11#id290)
- in analytics systems, [Operational Versus Analytical Systems](/en/ch1#sec_introduction_analytics)
- iterative processing, [Machine Learning](/en/ch11#id290)
- LLMs (see large language models (LLMs))
- models derived from training data, [Application code as a derivation function](/en/ch13#sec_future_dataflow_derivation)
- relation to batch processing, [Machine Learning](/en/ch11#id290)
- using a data lake, [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake)
- using GPUs, [Layering of cloud services](/en/ch1#layering-of-cloud-services), [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed)
- using matrices, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- madsim (deterministic simulation testing), [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- magic scaling sauce, [Principles for Scalability](/en/ch2#id35)
- maintainability, [Maintainability](/en/ch2#sec_introduction_maintainability)-[Evolvability: Making Change Easy](/en/ch2#sec_introduction_evolvability), [A Philosophy of Streaming Systems](/en/ch13#ch_philosophy)
- evolvability (see evolvability)
- operability, [Operability: Making Life Easy for Operations](/en/ch2#id37)
- simplicity and managing complexity, [Simplicity: Managing Complexity](/en/ch2#id38)
- many-to-many relationships, [Many-to-One and Many-to-Many Relationships](/en/ch3#sec_datamodels_many_to_many)
- modeling as graphs, [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- many-to-one relationships, [Many-to-One and Many-to-Many Relationships](/en/ch3#sec_datamodels_many_to_many)
- in star schema, [Stars and Snowflakes: Schemas for Analytics](/en/ch3#sec_datamodels_analytics)
- MapReduce (batch processing), [Batch Processing](/en/ch11#ch_batch), [MapReduce](/en/ch11#sec_batch_mapreduce)
- analysis of user activity events (example), [JOIN and GROUP BY](/en/ch11#sec_batch_join)
- comparison to stream processing, [Processing Streams](/en/ch12#sec_stream_processing)
- disadvantages and limitations of, [MapReduce](/en/ch11#sec_batch_mapreduce)
- fault tolerance, [Handling Faults](/en/ch11#id281)
- higher-level tools, [Query languages](/en/ch11#sec_batch_query_lanauges)
- mapper and reducer functions, [MapReduce](/en/ch11#sec_batch_mapreduce)
- shuffling data, [Shuffling Data](/en/ch11#sec_shuffle)
- sort-merge joins, [JOIN and GROUP BY](/en/ch11#sec_batch_join)
- workflows, [Scheduling Workflows](/en/ch11#sec_batch_workflows)
- (see also workflow engines)
- marshalling (see encoding)
- MartenDB (database), [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- master-slave replication (obsolete term), [Single-Leader Replication](/en/ch6#sec_replication_leader)
- materialization, [Glossary](/en/glossary)
- aggregate values, [Materialized Views and Data Cubes](/en/ch4#sec_storage_materialized_views)
- conflicts, [Materializing conflicts](/en/ch8#materializing-conflicts)
- materialized views, [Materialized Views and Data Cubes](/en/ch4#sec_storage_materialized_views)
- as derived data, [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived), [Composing Data Storage Technologies](/en/ch13#id447)-[Unbundled versus integrated systems](/en/ch13#id448)
- in event sourcing, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- incremental view maintenance, [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- (see also incremental view maintenance (IVM))
- maintaining, using stream processing, [Maintaining materialized views](/en/ch12#sec_stream_mat_view), [Table-table join (materialized view maintenance)](/en/ch12#id326)
- social network timeline example, [Materializing and Updating Timelines](/en/ch2#sec_introduction_materializing)
- Materialize (database), [Materialized Views and Data Cubes](/en/ch4#sec_storage_materialized_views)
- incremental view maintenance, [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- matrices, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- sparse, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- Maxwell (change data capture), [Implementing change data capture](/en/ch12#id307)
- mean, [Average, Median, and Percentiles](/en/ch2#id24)
- media monitoring, [Search on streams](/en/ch12#id320)
- median, [Average, Median, and Percentiles](/en/ch2#id24)
- meeting room booking (example), [More examples of write skew](/en/ch8#more-examples-of-write-skew), [Predicate locks](/en/ch8#predicate-locks), [Enforcing Constraints](/en/ch13#sec_future_constraints)
- Memcached (caching server), [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- Memgraph (database), [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- Cypher query language, [The Cypher Query Language](/en/ch3#id57)
- memory
- barrier (CPU instruction), [Linearizability and network delays](/en/ch10#linearizability-and-network-delays)
- corruption, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults)
- in-memory databases, [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- durability, [Durability](/en/ch8#durability)
- serial transaction execution, [Actual Serial Execution](/en/ch8#sec_transactions_serial)
- in-memory representation of data, [Formats for Encoding Data](/en/ch5#sec_encoding_formats)
- memtable (in LSM-trees), [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- random bit-flips in, [Trust, but Verify](/en/ch13#sec_future_verification)
- use by indexes, [Log-Structured Storage](/en/ch4#sec_storage_log_structured)
- memtable (in LSM-trees), [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- Mercurial (version control system), [Concurrency control](/en/ch12#sec_stream_concurrency)
- merge (DataFrame operator), [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- merging sorted files, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables), [Shuffling Data](/en/ch11#sec_shuffle)
- Merkle trees, [Tools for auditable data systems](/en/ch13#id366)
- Mesos (cluster manager), [Separation of application code and state](/en/ch13#id344)
- message brokers (see messaging systems)
- message-passing (see event-driven architecture)
- MessagePack (encoding format), [Binary encoding](/en/ch5#binary-encoding)
- messaging systems, [Stream Processing](/en/ch12#ch_stream)-[Replaying old messages](/en/ch12#sec_stream_replay)
- (see also streams)
- backpressure, buffering, or dropping messages, [Messaging Systems](/en/ch12#sec_stream_messaging)
- brokerless messaging, [Direct messaging from producers to consumers](/en/ch12#id296)
- event logs, [Log-based Message Brokers](/en/ch12#sec_stream_log)-[Replaying old messages](/en/ch12#sec_stream_replay)
- as data model, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- comparison to traditional messaging, [Logs compared to traditional messaging](/en/ch12#sec_stream_logs_vs_messaging), [Replaying old messages](/en/ch12#sec_stream_replay)
- consumer offsets, [Consumer offsets](/en/ch12#sec_stream_log_offsets)
- replaying old messages, [Replaying old messages](/en/ch12#sec_stream_replay), [Reprocessing data for application evolution](/en/ch13#sec_future_reprocessing), [Unifying batch and stream processing](/en/ch13#id338)
- slow consumers, [When consumers cannot keep up with producers](/en/ch12#id459)
- exactly-once semantics, [Exactly-once message processing](/en/ch8#sec_transactions_exactly_once), [Exactly-once message processing revisited](/en/ch8#exactly-once-message-processing-revisited), [Fault Tolerance](/en/ch12#sec_stream_fault_tolerance)
- message brokers, [Message brokers](/en/ch12#id433)-[Acknowledgments and redelivery](/en/ch12#sec_stream_reordering)
- acknowledgments and redelivery, [Acknowledgments and redelivery](/en/ch12#sec_stream_reordering)
- comparison to event logs, [Logs compared to traditional messaging](/en/ch12#sec_stream_logs_vs_messaging), [Replaying old messages](/en/ch12#sec_stream_replay)
- multiple consumers of same topic, [Multiple consumers](/en/ch12#id298)
- versus RPC, [Event-Driven Architectures](/en/ch5#sec_encoding_dataflow_msg)
- message loss, [Messaging Systems](/en/ch12#sec_stream_messaging)
- reliability, [Messaging Systems](/en/ch12#sec_stream_messaging)
- uniqueness in log-based messaging, [Uniqueness in log-based messaging](/en/ch13#sec_future_uniqueness_log)
- metastable failure, [Describing Performance](/en/ch2#sec_introduction_percentiles)
- metered billing
- serverless, [Microservices and Serverless](/en/ch1#sec_introduction_microservices)
- storage, [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- microbatching, [Microbatching and checkpointing](/en/ch12#id329)
- microservices, [Microservices and Serverless](/en/ch1#sec_introduction_microservices)
- (see also services)
- causal dependencies across services, [The limits of total ordering](/en/ch13#id335)
- loose coupling, [Making unbundling work](/en/ch13#sec_future_unbundling_favor)
- relation to batch/stream processors, [Batch Processing](/en/ch11#ch_batch), [Stream processors and services](/en/ch13#id345)
- Microsoft
- Azure Blob Storage (see Azure Blob Storage)
- Azure managed disks, [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute)
- Azure Service Bus (messaging), [Message brokers](/en/ch5#message-brokers), [Message brokers compared to databases](/en/ch12#id297)
- Azure SQL DB (database), [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native)
- Azure Storage, [Object Stores](/en/ch11#id277)
- Azure Stream Analytics, [Stream analytics](/en/ch12#id318)
- Azure Synapse Analytics (database), [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native)
- DCOM (Distributed Component Object Model), [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc)
- MSDTC (transaction coordinator), [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc)
- SQL Server (see SQL Server)
- Microsoft Power BI (see Power BI (business intelligence software))
- migrating (rewriting) data, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility), [Different values written at different times](/en/ch5#different-values-written-at-different-times), [Deriving several views from the same event log](/en/ch12#sec_stream_deriving_views), [Reprocessing data for application evolution](/en/ch13#sec_future_reprocessing)
- MinIO (object storage), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- mobile apps, [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs)
- embedded databases, [Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- model checking, [Model checking and specification languages](/en/ch9#model-checking-and-specification-languages)
- modulus operator (%), [Hash modulo number of nodes](/en/ch7#hash-modulo-number-of-nodes)
- Mojo (programming language)
- memory management, [Limiting the impact of garbage collection](/en/ch9#sec_distributed_gc_impact)
- MongoDB (database)
- aggregation pipeline, [Query languages for documents](/en/ch3#query-languages-for-documents)
- atomic operations, [Atomic write operations](/en/ch8#atomic-write-operations)
- BSON, [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality)
- document data model, [Relational Model versus Document Model](/en/ch3#sec_datamodels_history)
- hash-range sharding, [Sharding by Hash of Key](/en/ch7#sec_sharding_hash), [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- in the cloud, [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native)
- join support, [Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- joins (\$lookup operator), [Normalization, Denormalization, and Joins](/en/ch3#sec_datamodels_normalization)
- JSON Schema validation, [JSON Schema](/en/ch5#json-schema)
- leader-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- ObjectIds, [ID Generators and Logical Clocks](/en/ch10#sec_consistency_logical)
- range-based sharding, [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- request routing, [Request Routing](/en/ch7#sec_sharding_routing)
- secondary indexes, [Local Secondary Indexes](/en/ch7#id166)
- shard splitting, [Rebalancing key-range sharded data](/en/ch7#rebalancing-key-range-sharded-data)
- stored procedures, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs)
- monitoring, [Operations in the Cloud Era](/en/ch1#sec_introduction_operations), [Humans and Reliability](/en/ch2#id31), [Operability: Making Life Easy for Operations](/en/ch2#id37)
- monotonic clocks, [Monotonic clocks](/en/ch9#monotonic-clocks)
- monotonic reads, [Monotonic Reads](/en/ch6#sec_replication_monotonic_reads)
- Morel (query language), [Query languages](/en/ch11#sec_batch_query_lanauges)
- MSMQ (messaging), [XA transactions](/en/ch8#xa-transactions)
- multi-column indexes, [Multidimensional and Full-Text Indexes](/en/ch4#sec_storage_multidimensional)
- multi-leader replication, [Multi-Leader Replication](/en/ch6#sec_replication_multi_leader)-[Types of conflict](/en/ch6#sec_replication_write_conflicts)
- (see also replication)
- collaborative editing, [Real-time collaboration, offline-first, and local-first apps](/en/ch6#real-time-collaboration-offline-first-and-local-first-apps)
- conflict detection, [Types of conflict](/en/ch6#sec_replication_write_conflicts)
- conflict resolution, [Dealing with Conflicting Writes](/en/ch6#sec_replication_write_conflicts)
- for multi-region replication, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc), [The Cost of Linearizability](/en/ch10#sec_linearizability_cost)
- linearizability, lack of, [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)
- offline-capable clients, [Sync Engines and Local-First Software](/en/ch6#sec_replication_offline_clients)
- replication topologies, [Multi-leader replication topologies](/en/ch6#sec_replication_topologies)-[Problems with different topologies](/en/ch6#problems-with-different-topologies)
- multi-object transactions, [Single-Object and Multi-Object Operations](/en/ch8#sec_transactions_multi_object)
- need for, [The need for multi-object transactions](/en/ch8#sec_transactions_need)
- Multi-Paxos (consensus algorithm), [Consensus in Practice](/en/ch10#sec_consistency_total_order)
- multi-reader single-writer lock, [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- multi-table index cluster tables (Oracle), [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality)
- multi-version concurrency control (MVCC), [Multi-version concurrency control (MVCC)](/en/ch8#sec_transactions_snapshot_impl), [Summary](/en/ch8#summary)
- detecting stale MVCC reads, [Detecting stale MVCC reads](/en/ch8#detecting-stale-mvcc-reads)
- indexes and snapshot isolation, [Indexes and snapshot isolation](/en/ch8#indexes-and-snapshot-isolation)
- using synchronized clocks, [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- multidimensional arrays, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- multitenancy, [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute), [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- by sharding, [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- using embedded databases, [Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- versus Byzantine fault tolerance, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)
- mutual exclusion, [Pessimistic versus optimistic concurrency control](/en/ch8#pessimistic-versus-optimistic-concurrency-control)
- (see also locks)
- MySQL (database)
- archiving WAL to object stores, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- binlog coordinates, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- change data capture, [Implementing change data capture](/en/ch12#id307), [API support for change streams](/en/ch12#sec_stream_change_api)
- circular replication topology, [Multi-leader replication topologies](/en/ch6#sec_replication_topologies)
- consistent snapshots, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- distributed transaction support, [XA transactions](/en/ch8#xa-transactions)
- global transaction identifiers (GTIDs), [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- in the cloud, [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native)
- InnoDB storage engine (see InnoDB)
- leader-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- multi-leader replication, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- row-based replication, [Logical (row-based) log replication](/en/ch6#logical-row-based-log-replication)
- sharding (see Vitess (database))
- snapshot isolation support, [Snapshot isolation, repeatable read, and naming confusion](/en/ch8#snapshot-isolation-repeatable-read-and-naming-confusion)
- (see also InnoDB)
- statement-based replication, [Statement-based replication](/en/ch6#statement-based-replication)
### N
- N+1 query problem, [Object-relational mapping (ORM)](/en/ch3#object-relational-mapping-orm)
- nanomsg (messaging library), [Direct messaging from producers to consumers](/en/ch12#id296)
- Narayana (transaction coordinator), [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc)
- NATS (messaging), [Message brokers](/en/ch5#message-brokers)
- natural language processing, [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake)
- Neo4j (database)
- Cypher query language, [The Cypher Query Language](/en/ch3#id57)
- graph data model, [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- Neon (database), [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- Nephele (dataflow engine), [Dataflow Engines](/en/ch11#sec_batch_dataflow)
- Neptune (graph database), [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- Cypher query language, [The Cypher Query Language](/en/ch3#id57)
- SPARQL query language, [The SPARQL query language](/en/ch3#the-sparql-query-language)
- netcode (game development), [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- Network Attached Storage (NAS), [Shared-Memory, Shared-Disk, and Shared-Nothing Architecture](/en/ch2#sec_introduction_shared_nothing), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- network model (data representation), [Relational Model versus Document Model](/en/ch3#sec_datamodels_history)
- Network Time Protocol (see NTP)
- networks
- congestion and queueing, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- datacenter network topologies, [Cloud Computing Versus Supercomputing](/en/ch1#id17)
- faults (see faults)
- linearizability and network delays, [Linearizability and network delays](/en/ch10#linearizability-and-network-delays)
- network partitions, [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)
- in CAP theorem, [The Cost of Linearizability](/en/ch10#sec_linearizability_cost)
- timeouts and unbounded delays, [Timeouts and Unbounded Delays](/en/ch9#sec_distributed_queueing)
- NewSQL, [Relational Model versus Document Model](/en/ch3#sec_datamodels_history), [Solutions for Replication Lag](/en/ch6#id131)
- transactions and, [What Exactly Is a Transaction?](/en/ch8#sec_transactions_overview), [Database-internal Distributed Transactions](/en/ch8#sec_transactions_internal)
- next-key locking, [Index-range locks](/en/ch8#sec_transactions_2pl_range)
- NFS (network file system), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- on object storage, [Object Stores](/en/ch11#id277)
- Nimble (data format), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses), [Column-Oriented Storage](/en/ch4#sec_storage_column)
- (see also column-oriented storage)
- node (in graphs) (see vertices)
- nodes (processes), [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed), [Glossary](/en/glossary)
- handling outages in leader-based replication, [Handling Node Outages](/en/ch6#sec_replication_failover)
- system models for failure, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- noisy neighbors, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- nonblocking atomic commit, [Three-phase commit](/en/ch8#three-phase-commit)
- nondeterministic operations, [Statement-based replication](/en/ch6#statement-based-replication)
- (see also deterministic operations)
- in distributed systems, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- in workflow engines, [Durable execution](/en/ch5#durable-execution)
- partial failures, [Faults and Partial Failures](/en/ch9#sec_distributed_partial_failure)
- sources of nondeterminism, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- nonfunctional requirements, [Defining Nonfunctional Requirements](/en/ch2#ch_nonfunctional), [Summary](/en/ch2#summary)
- nonrepeatable reads, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)
- (see also read skew)
- normalization (data representation), [Normalization, Denormalization, and Joins](/en/ch3#sec_datamodels_normalization)-[Many-to-One and Many-to-Many Relationships](/en/ch3#sec_datamodels_many_to_many), [Glossary](/en/glossary)
- foreign key references, [The need for multi-object transactions](/en/ch8#sec_transactions_need)
- in social network case study, [Denormalization in the social networking case study](/en/ch3#denormalization-in-the-social-networking-case-study)
- in systems of record, [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived)
- versus denormalization, [Deriving several views from the same event log](/en/ch12#sec_stream_deriving_views)
- NoSQL, [Relational Model versus Document Model](/en/ch3#sec_datamodels_history), [Solutions for Replication Lag](/en/ch6#id131), [Unbundling Databases](/en/ch13#sec_future_unbundling)
- transactions and, [What Exactly Is a Transaction?](/en/ch8#sec_transactions_overview)
- Notation3 (N3), [Triple-Stores and SPARQL](/en/ch3#id59)
- NTP (Network Time Protocol), [Unreliable Clocks](/en/ch9#sec_distributed_clocks)
- accuracy, [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy), [Timestamps for ordering events](/en/ch9#sec_distributed_lww)
- adjustments to monotonic clocks, [Monotonic clocks](/en/ch9#monotonic-clocks)
- multiple server addresses, [Weak forms of lying](/en/ch9#weak-forms-of-lying)
- numbers, in XML and JSON encodings, [JSON, XML, and Binary Variants](/en/ch5#sec_encoding_json)
- NumPy (Python library), [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes), [Column-Oriented Storage](/en/ch4#sec_storage_column)
- NVMe (Non-Volatile Memory Express) (see solid state drives (SSDs))
### O
- object databases, [Relational Model versus Document Model](/en/ch3#sec_datamodels_history)
- object storage, [Layering of cloud services](/en/ch1#layering-of-cloud-services), [Object Stores](/en/ch11#id277)
- Azure Blob Storage (see Azure Blob Storage)
- comparison to distributed filesystems, [Object Stores](/en/ch11#id277)
- comparison to key-value stores, [Object Stores](/en/ch11#id277)
- databases backed by, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- for backups, [Replication](/en/ch6#ch_replication)
- for cloud data warehouses, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses), [Writing to Column-Oriented Storage](/en/ch4#writing-to-column-oriented-storage)
- for database replication, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- Google Cloud Storage (see Google Cloud Storage)
- object size, [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute)
- S3 (see S3 (object storage))
- storing LSM segment files, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- support for fencing, [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens)
- use in data lakes, [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake)
- object-relational mapping (ORM) frameworks, [Object-relational mapping (ORM)](/en/ch3#object-relational-mapping-orm)
- error handling and aborted transactions, [Handling errors and aborts](/en/ch8#handling-errors-and-aborts)
- unsafe read-modify-write cycle code, [Atomic write operations](/en/ch8#atomic-write-operations)
- object-relational mismatch, [The Object-Relational Mismatch](/en/ch3#sec_datamodels_document)
- observability, [Problems with Distributed Systems](/en/ch1#sec_introduction_dist_sys_problems), [Humans and Reliability](/en/ch2#id31), [Operability: Making Life Easy for Operations](/en/ch2#id37)
- observer pattern, [Separation of application code and state](/en/ch13#id344)
- OBT (one big table), [Stars and Snowflakes: Schemas for Analytics](/en/ch3#sec_datamodels_analytics)
- offline systems, [Batch Processing](/en/ch11#ch_batch)
- (see also batch processing)
- offline-first applications, [Real-time collaboration, offline-first, and local-first apps](/en/ch6#real-time-collaboration-offline-first-and-local-first-apps), [Stateful, offline-capable clients](/en/ch13#id347)
- offsets
- consumer offsets in sharded logs, [Consumer offsets](/en/ch12#sec_stream_log_offsets)
- messages in sharded logs, [Using logs for message storage](/en/ch12#id300)
- OLAP (online analytic processing), [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp), [Glossary](/en/glossary)
- data cubes, [Materialized Views and Data Cubes](/en/ch4#sec_storage_materialized_views)
- OLTP (online transaction processing), [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp), [Glossary](/en/glossary)
- analytics queries versus, [Analytics](/en/ch11#sec_batch_olap)
- data normalization, [Trade-offs of normalization](/en/ch3#trade-offs-of-normalization)
- workload characteristics, [Actual Serial Execution](/en/ch8#sec_transactions_serial)
- on-premises deployment, [Cloud Versus Self-Hosting](/en/ch1#sec_introduction_cloud)
- data warehouses, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- one big table (data warehouse schema), [Stars and Snowflakes: Schemas for Analytics](/en/ch3#sec_datamodels_analytics)
- one-hot encoding, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- one-to-few relationships, [The document data model for one-to-many relationships](/en/ch3#the-document-data-model-for-one-to-many-relationships)
- one-to-many relationships, [The document data model for one-to-many relationships](/en/ch3#the-document-data-model-for-one-to-many-relationships)
- JSON representation, [The document data model for one-to-many relationships](/en/ch3#the-document-data-model-for-one-to-many-relationships)
- online systems, [Batch Processing](/en/ch11#ch_batch)
- (see also services)
- versus scientific computing, [Cloud Computing Versus Supercomputing](/en/ch1#id17)
- ontologies, [Triple-Stores and SPARQL](/en/ch3#id59)
- Oozie (workflow scheduler), [Batch Processing](/en/ch11#ch_batch)
- OpenAPI (service definition format), [Microservices and Serverless](/en/ch1#sec_introduction_microservices), [Web services](/en/ch5#sec_web_services)
- use of JSON Schema, [JSON Schema](/en/ch5#json-schema)
- openCypher (see Cypher (query language))
- OpenLink Virtuoso (see Virtuoso (database))
- OpenStack
- Swift (object storage), [Object Stores](/en/ch11#id277)
- operability, [Operability: Making Life Easy for Operations](/en/ch2#id37)
- operating systems versus databases, [Unbundling Databases](/en/ch13#sec_future_unbundling)
- operational systems, [Operational Versus Analytical Systems](/en/ch1#sec_introduction_analytics)
- (see also OLTP)
- as systems of record, [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived)
- ETL into analytical systems, [Data Warehousing](/en/ch1#sec_introduction_dwh)
- operational transformation, [CRDTs and Operational Transformation](/en/ch6#sec_replication_crdts)
- operations teams, [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- operators (query execution), [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- in stream processing, [Processing Streams](/en/ch12#sec_stream_processing)
- optimistic concurrency control, [Pessimistic versus optimistic concurrency control](/en/ch8#pessimistic-versus-optimistic-concurrency-control)
- optimistic locking, [Conditional writes (compare-and-set)](/en/ch8#sec_transactions_compare_and_set)
- Oracle (database)
- distributed transaction support, [XA transactions](/en/ch8#xa-transactions)
- GoldenGate (change data capture), [Implementing change data capture](/en/ch12#id307)
- hierarchical queries, [Graph Queries in SQL](/en/ch3#id58)
- lack of serializability, [Isolation](/en/ch8#sec_transactions_acid_isolation)
- leader-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- multi-leader replication, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- multi-table index cluster tables, [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality)
- not preventing write skew, [Characterizing write skew](/en/ch8#characterizing-write-skew)
- PL/SQL language, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs)
- preventing lost updates, [Automatically detecting lost updates](/en/ch8#automatically-detecting-lost-updates)
- read committed isolation, [Implementing read committed](/en/ch8#sec_transactions_read_committed_impl)
- Real Application Clusters (RAC), [Locking and leader election](/en/ch10#locking-and-leader-election)
- snapshot isolation support, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation), [Snapshot isolation, repeatable read, and naming confusion](/en/ch8#snapshot-isolation-repeatable-read-and-naming-confusion)
- TimesTen (in-memory database), [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- WAL-based replication, [Write-ahead log (WAL) shipping](/en/ch6#write-ahead-log-wal-shipping)
- ORC (data format), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses), [Column-Oriented Storage](/en/ch4#sec_storage_column)
- (see also column-oriented storage)
- orchestration (service deployment), [Cloud Versus Self-Hosting](/en/ch1#sec_introduction_cloud), [Microservices and Serverless](/en/ch1#sec_introduction_microservices)
- batch job execution, [Distributed Job Orchestration](/en/ch11#id278)
- workflow engines, [Batch Processing](/en/ch11#ch_batch)
- ordering
- event logs, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- limits of total ordering, [The limits of total ordering](/en/ch13#id335)
- logical timestamps, [Logical Clocks](/en/ch10#sec_consistency_timestamps)
- of auto-incrementing IDs, [ID Generators and Logical Clocks](/en/ch10#sec_consistency_logical)
- shared logs, [Consensus in Practice](/en/ch10#sec_consistency_total_order)-[Pros and cons of consensus](/en/ch10#pros-and-cons-of-consensus)
- Orkes (workflow engine), [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows)
- orphan pages (B-trees), [Making B-trees reliable](/en/ch4#sec_storage_btree_wal)
- outbox pattern, [Change data capture versus event sourcing](/en/ch12#sec_stream_event_sourcing)
- outliers (response time), [Average, Median, and Percentiles](/en/ch2#id24)
- outsourcing, [Cloud Versus Self-Hosting](/en/ch1#sec_introduction_cloud)
- overload, [Describing Performance](/en/ch2#sec_introduction_percentiles), [Handling errors and aborts](/en/ch8#handling-errors-and-aborts)
### P
- PACELC principle, [The CAP theorem](/en/ch10#the-cap-theorem)
- package managers, [Separation of application code and state](/en/ch13#id344)
- packet switching, [Can we not simply make network delays predictable?](/en/ch9#can-we-not-simply-make-network-delays-predictable)
- packets
- corruption of, [Weak forms of lying](/en/ch9#weak-forms-of-lying)
- sending via UDP, [Direct messaging from producers to consumers](/en/ch12#id296)
- PageRank (algorithm), [Graph-Like Data Models](/en/ch3#sec_datamodels_graph), [Query languages](/en/ch11#sec_batch_query_lanauges), [Machine Learning](/en/ch11#id290)
- paging (see virtual memory)
- pandas (Python library), [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake), [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes), [Column-Oriented Storage](/en/ch4#sec_storage_column), [DataFrames](/en/ch11#id287)
- Parquet (data format), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses), [Column-Oriented Storage](/en/ch4#sec_storage_column), [Archival storage](/en/ch5#archival-storage), [Query languages](/en/ch11#sec_batch_query_lanauges)
- (see also column-oriented storage)
- databases on object storage, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- document data model, [Column-Oriented Storage](/en/ch4#sec_storage_column)
- use in batch processing, [MapReduce](/en/ch11#sec_batch_mapreduce)
- partial failures, [Faults and Partial Failures](/en/ch9#sec_distributed_partial_failure), [Summary](/en/ch9#summary)
- limping, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- partial synchrony (system model), [System Model and Reality](/en/ch9#sec_distributed_system_model)
- partition key, [Pros and Cons of Sharding](/en/ch7#sec_sharding_reasons), [Sharding of Key-Value Data](/en/ch7#sec_sharding_key_value)
- partitioning (see sharding)
- Paxos (consensus algorithm), [Consensus](/en/ch10#sec_consistency_consensus), [Consensus in Practice](/en/ch10#sec_consistency_total_order)
- ballot number, [From single-leader replication to consensus](/en/ch10#from-single-leader-replication-to-consensus)
- Multi-Paxos, [Consensus in Practice](/en/ch10#sec_consistency_total_order)
- payment card industry (PCI), [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance)
- PCI (payment card industry) compliance, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance)
- percentiles, [Average, Median, and Percentiles](/en/ch2#id24), [Glossary](/en/glossary)
- calculating efficiently, [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla)
- importance of high percentiles, [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla)
- use in service level agreements (SLAs), [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla)
- Percolator (Google), [Implementing a linearizable ID generator](/en/ch10#implementing-a-linearizable-id-generator)
- Percona XtraBackup (MySQL tool), [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- performance
- degradation as fault, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- describing, [Describing Performance](/en/ch2#sec_introduction_percentiles)
- of distributed transactions, [Distributed Transactions Across Different Systems](/en/ch8#sec_transactions_xa)
- of in-memory databases, [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- of linearizability, [Linearizability and network delays](/en/ch10#linearizability-and-network-delays)
- of multi-leader replication, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- permission isolation, [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- perpetual inconsistency, [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- pessimistic concurrency control, [Pessimistic versus optimistic concurrency control](/en/ch8#pessimistic-versus-optimistic-concurrency-control)
- pglogical (PostgreSQL extension), [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- pgvector (vector index), [Vector Embeddings](/en/ch4#id92)
- phantoms (transaction isolation), [Phantoms causing write skew](/en/ch8#sec_transactions_phantom)
- materializing conflicts, [Materializing conflicts](/en/ch8#materializing-conflicts)
- preventing, in serializability, [Predicate locks](/en/ch8#predicate-locks)
- physical clocks (see clocks)
- pickle (Python), [Language-Specific Formats](/en/ch5#id96)
- Pinot (database), [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp), [Column-Oriented Storage](/en/ch4#sec_storage_column)
- handling writes, [Writing to Column-Oriented Storage](/en/ch4#writing-to-column-oriented-storage)
- pre-aggregation, [Analytics](/en/ch11#sec_batch_olap)
- serving derived data, [Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- pipelined execution
- in data warehouse queries, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- pivot table, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- point in time, [Unreliable Clocks](/en/ch9#sec_distributed_clocks)
- point query, [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp)
- Polaris (data catalog), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- polling, [Representing Users, Posts, and Follows](/en/ch2#id20)
- polystores, [The meta-database of everything](/en/ch13#id341)
- POSIX (portable operating system interface)
- compliant filesystems, [Setting Up New Followers](/en/ch6#sec_replication_new_replica), [Distributed Filesystems](/en/ch11#sec_batch_dfs), [Object Stores](/en/ch11#id277)
- Post Office Horizon scandal, [Humans and Reliability](/en/ch2#id31)
- lack of transactions, [Transactions](/en/ch8#ch_transactions)
- PostgreSQL (database)
- archiving WAL to object stores, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- change data capture, [Implementing change data capture](/en/ch12#id307), [API support for change streams](/en/ch12#sec_stream_change_api)
- distributed transaction support, [XA transactions](/en/ch8#xa-transactions)
- foreign data wrappers, [The meta-database of everything](/en/ch13#id341)
- full text search support, [Combining Specialized Tools by Deriving Data](/en/ch13#id442)
- in the cloud, [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native)
- JSON Schema validation, [JSON Schema](/en/ch5#json-schema)
- leader-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- log sequence number, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- logical decoding, [Logical (row-based) log replication](/en/ch6#logical-row-based-log-replication)
- materialized view maintenance, [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- multi-leader replication, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- MVCC implementation, [Multi-version concurrency control (MVCC)](/en/ch8#sec_transactions_snapshot_impl), [Indexes and snapshot isolation](/en/ch8#indexes-and-snapshot-isolation)
- partitioning vs. sharding, [Sharding](/en/ch7#ch_sharding)
- pgvector (vector index), [Vector Embeddings](/en/ch4#id92)
- PL/pgSQL language, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs)
- PostGIS geospatial indexes, [Multidimensional and Full-Text Indexes](/en/ch4#sec_storage_multidimensional)
- preventing lost updates, [Automatically detecting lost updates](/en/ch8#automatically-detecting-lost-updates)
- preventing write skew, [Characterizing write skew](/en/ch8#characterizing-write-skew), [Serializable Snapshot Isolation (SSI)](/en/ch8#sec_transactions_ssi)
- read committed isolation, [Implementing read committed](/en/ch8#sec_transactions_read_committed_impl)
- representing graphs, [Property Graphs](/en/ch3#id56)
- serializable snapshot isolation (SSI), [Serializable Snapshot Isolation (SSI)](/en/ch8#sec_transactions_ssi)
- sharding (see Citus (database))
- snapshot isolation support, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation), [Snapshot isolation, repeatable read, and naming confusion](/en/ch8#snapshot-isolation-repeatable-read-and-naming-confusion)
- WAL-based replication, [Write-ahead log (WAL) shipping](/en/ch6#write-ahead-log-wal-shipping)
- postings list, [Full-Text Search](/en/ch4#sec_storage_full_text)
- in sharded indexes, [Local Secondary Indexes](/en/ch7#id166)
- postmortems, blameless, [Humans and Reliability](/en/ch2#id31)
- PouchDB (database), [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- Power BI (business intelligence software), [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp), [Analytics](/en/ch11#sec_batch_olap)
- pre-aggregation, [Analytics](/en/ch11#sec_batch_olap)
- serving derived data, [Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- pre-splitting, [Rebalancing key-range sharded data](/en/ch7#rebalancing-key-range-sharded-data)
- Precision Time Protocol (PTP), [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy)
- predicate locks, [Predicate locks](/en/ch8#predicate-locks)
- predictive analytics, [Operational Versus Analytical Systems](/en/ch1#sec_introduction_analytics), [Predictive Analytics](/en/ch14#id369)-[Feedback Loops](/en/ch14#id372)
- amplifying bias, [Bias and Discrimination](/en/ch14#id370)
- ethics of (see ethics)
- feedback loops, [Feedback Loops](/en/ch14#id372)
- preemption, [Resource Allocation](/en/ch11#id279)
- in distributed schedulers, [Handling Faults](/en/ch11#id281)
- of threads, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- Prefect (workflow scheduler), [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows), [Batch Processing](/en/ch11#ch_batch), [Scheduling Workflows](/en/ch11#sec_batch_workflows)
- cloud data warehouse integration, [Query languages](/en/ch11#sec_batch_query_lanauges)
- Presto (query engine), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- primary keys, [Multi-Column and Secondary Indexes](/en/ch4#sec_storage_index_multicolumn), [Glossary](/en/glossary)
- auto-incrementing, [ID Generators and Logical Clocks](/en/ch10#sec_consistency_logical)
- versus partition key, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- primary-backup replication (see leader-based replication)
- privacy, [Privacy and Tracking](/en/ch14#id373)-[Legislation and Self-Regulation](/en/ch14#sec_future_legislation)
- consent and freedom of choice, [Consent and Freedom of Choice](/en/ch14#id375)
- data as assets and power, [Data as Assets and Power](/en/ch14#id376)
- deleting data, [Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- ethical considerations (see ethics)
- legislation and self-regulation, [Legislation and Self-Regulation](/en/ch14#sec_future_legislation)
- meaning of, [Privacy and Use of Data](/en/ch14#id457)
- regulation, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance)
- surveillance, [Surveillance](/en/ch14#id374)
- tracking behavioral data, [Privacy and Tracking](/en/ch14#id373)
- probabilistic algorithms, [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla), [Stream analytics](/en/ch12#id318)
- process pauses, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)-[Limiting the impact of garbage collection](/en/ch9#sec_distributed_gc_impact)
- processing time (of events), [Reasoning About Time](/en/ch12#sec_stream_time)
- producers (message streams), [Transmitting Event Streams](/en/ch12#sec_stream_transmit)
- product analytics, [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp)
- column-oriented storage, [Column-Oriented Storage](/en/ch4#sec_storage_column)
- programming languages
- for stored procedures, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs)
- projections (event sourcing), [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- Prolog (language), [Datalog: Recursive Relational Queries](/en/ch3#id62)
- (see also Datalog)
- property graphs, [Property Graphs](/en/ch3#id56)
- Cypher query language, [The Cypher Query Language](/en/ch3#id57)
- Property Graph Query Language (PGQL), [Graph Queries in SQL](/en/ch3#id58)
- property-based testing, [Humans and Reliability](/en/ch2#id31), [Formal Methods and Randomized Testing](/en/ch9#sec_distributed_formal)
- Protocol Buffers (data format), [Protocol Buffers](/en/ch5#sec_encoding_protobuf)-[Field tags and schema evolution](/en/ch5#field-tags-and-schema-evolution)
- field tags and schema evolution, [Field tags and schema evolution](/en/ch5#field-tags-and-schema-evolution)
- provenance of data, [Designing for auditability](/en/ch13#id365)
- publish/subscribe model, [Messaging Systems](/en/ch12#sec_stream_messaging)
- publishers (message streams), [Transmitting Event Streams](/en/ch12#sec_stream_transmit)
- Pulsar (streaming platform), [Acknowledgments and redelivery](/en/ch12#sec_stream_reordering)
- PyTorch (machine learning library), [Machine Learning](/en/ch11#id290)
### Q
- Qpid (messaging), [Message brokers compared to databases](/en/ch12#id297)
- quality of service (QoS), [Can we not simply make network delays predictable?](/en/ch9#can-we-not-simply-make-network-delays-predictable)
- Quantcast File System (distributed filesystem), [Object Stores](/en/ch11#id277)
- query engines
- compilation and vectorization, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- in cloud data warehouse, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- operators, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- optimizing declarative queries, [Data Models and Query Languages](/en/ch3#ch_datamodels)
- query languages
- Cypher, [The Cypher Query Language](/en/ch3#id57)
- Datalog, [Datalog: Recursive Relational Queries](/en/ch3#id62)
- GraphQL, [GraphQL](/en/ch3#id63)
- MongoDB aggregation pipeline, [Normalization, Denormalization, and Joins](/en/ch3#sec_datamodels_normalization), [Query languages for documents](/en/ch3#query-languages-for-documents)
- recursive SQL queries, [Graph Queries in SQL](/en/ch3#id58)
- SPARQL, [The SPARQL query language](/en/ch3#the-sparql-query-language)
- SQL, [Normalization, Denormalization, and Joins](/en/ch3#sec_datamodels_normalization)
- query optimizers, [Query languages](/en/ch11#sec_batch_query_lanauges)
- query plans, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- queueing delays, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- head-of-line blocking, [Latency and Response Time](/en/ch2#id23)
- latency and response time, [Latency and Response Time](/en/ch2#id23)
- queues (messaging), [Message brokers](/en/ch5#message-brokers)
- QUIC (protocol), [The Limitations of TCP](/en/ch9#sec_distributed_tcp)
- quorums, [Quorums for reading and writing](/en/ch6#sec_replication_quorum_condition)-[Multi-region operation](/en/ch6#multi-region-operation), [Glossary](/en/glossary)
- for leaderless replication, [Quorums for reading and writing](/en/ch6#sec_replication_quorum_condition)
- in consensus algorithms, [From single-leader replication to consensus](/en/ch10#from-single-leader-replication-to-consensus)
- limitations of consistency, [Limitations of Quorum Consistency](/en/ch6#sec_replication_quorum_limitations)-[Monitoring staleness](/en/ch6#monitoring-staleness), [Linearizability and quorums](/en/ch10#sec_consistency_quorum_linearizable)
- making decisions in distributed systems, [The Majority Rules](/en/ch9#sec_distributed_majority)
- monitoring staleness, [Monitoring staleness](/en/ch6#monitoring-staleness)
- multi-region replication, [Multi-region operation](/en/ch6#multi-region-operation)
- relying on durability, [Mapping system models to the real world](/en/ch9#mapping-system-models-to-the-real-world)
- quotas, [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
### R
- R (language), [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake), [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes), [DataFrames](/en/ch11#id287)
- R-trees (indexes), [Multidimensional and Full-Text Indexes](/en/ch4#sec_storage_multidimensional)
- R2 (object storage), [Layering of cloud services](/en/ch1#layering-of-cloud-services), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- RabbitMQ (messaging), [Message brokers](/en/ch5#message-brokers), [Message brokers compared to databases](/en/ch12#id297)
- quorum queues (replication), [Single-Leader Replication](/en/ch6#sec_replication_leader)
- race conditions, [Isolation](/en/ch8#sec_transactions_acid_isolation)
- (see also concurrency)
- avoiding with linearizability, [Cross-channel timing dependencies](/en/ch10#cross-channel-timing-dependencies)
- caused by dual writes, [Keeping Systems in Sync](/en/ch12#sec_stream_sync)
- causing loss of money, [Weak Isolation Levels](/en/ch8#sec_transactions_isolation_levels)
- dirty writes, [No dirty writes](/en/ch8#sec_transactions_dirty_write)
- in counter increments, [No dirty writes](/en/ch8#sec_transactions_dirty_write)
- lost updates, [Preventing Lost Updates](/en/ch8#sec_transactions_lost_update)-[Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- preventing with event logs, [Concurrency control](/en/ch12#sec_stream_concurrency), [Dataflow: Interplay between state changes and application code](/en/ch13#id450)
- preventing with serializable isolation, [Serializability](/en/ch8#sec_transactions_serializability)
- weak transaction isolation, [Weak Isolation Levels](/en/ch8#sec_transactions_isolation_levels)
- write skew, [Write Skew and Phantoms](/en/ch8#sec_transactions_write_skew)-[Materializing conflicts](/en/ch8#materializing-conflicts)
- Raft (consensus algorithm), [Consensus](/en/ch10#sec_consistency_consensus), [Consensus in Practice](/en/ch10#sec_consistency_total_order)
- leader-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- sensitivity to network problems, [Pros and cons of consensus](/en/ch10#pros-and-cons-of-consensus)
- term number, [From single-leader replication to consensus](/en/ch10#from-single-leader-replication-to-consensus)
- use in etcd, [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)
- RAID (Redundant Array of Independent Disks), [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute), [Tolerating hardware faults through redundancy](/en/ch2#tolerating-hardware-faults-through-redundancy), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- railways, schema migration on, [Reprocessing data for application evolution](/en/ch13#sec_future_reprocessing)
- RAM (see memory)
- RAMCloud (in-memory storage), [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- random writes (access pattern), [Sequential versus random writes](/en/ch4#sidebar_sequential)
- range queries
- in B-trees, [B-Trees](/en/ch4#sec_storage_b_trees), [Read performance](/en/ch4#read-performance)
- in LSM-trees, [Read performance](/en/ch4#read-performance)
- not efficient in hash maps, [Log-Structured Storage](/en/ch4#sec_storage_log_structured)
- with hash sharding, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- ranking algorithms, [Machine Learning](/en/ch11#id290)
- Ray (workflow scheduler), [Machine Learning](/en/ch11#id290)
- RDF (Resource Description Framework), [The RDF data model](/en/ch3#the-rdf-data-model)
- querying with SPARQL, [The SPARQL query language](/en/ch3#the-sparql-query-language)
- RDMA (Remote Direct Memory Access), [Layering of cloud services](/en/ch1#layering-of-cloud-services), [Cloud Computing Versus Supercomputing](/en/ch1#id17)
- React (user interface library), [End-to-end event streams](/en/ch13#id349)
- reactive programming, [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- read committed isolation level, [Read Committed](/en/ch8#sec_transactions_read_committed)-[Implementing read committed](/en/ch8#sec_transactions_read_committed_impl)
- implementing, [Implementing read committed](/en/ch8#sec_transactions_read_committed_impl)
- multi-version concurrency control (MVCC), [Multi-version concurrency control (MVCC)](/en/ch8#sec_transactions_snapshot_impl)
- no dirty reads, [No dirty reads](/en/ch8#no-dirty-reads)
- no dirty writes, [No dirty writes](/en/ch8#sec_transactions_dirty_write)
- read models (event sourcing), [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- read path (derived data), [Observing Derived State](/en/ch13#sec_future_observing)
- read repair (leaderless replication), [Catching up on missed writes](/en/ch6#sec_replication_read_repair)
- for linearizability, [Linearizability and quorums](/en/ch10#sec_consistency_quorum_linearizable)
- read replicas (see leader-based replication)
- read skew (transaction isolation), [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation), [Summary](/en/ch8#summary)
- read uncommitted isolation level, [Implementing read committed](/en/ch8#sec_transactions_read_committed_impl)
- read-after-write consistency, [Reading Your Own Writes](/en/ch6#sec_replication_ryw), [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- cross-device, [Reading Your Own Writes](/en/ch6#sec_replication_ryw)
- in derived data systems, [Derived data versus distributed transactions](/en/ch13#sec_future_derived_vs_transactions)
- read-modify-write cycle, [Preventing Lost Updates](/en/ch8#sec_transactions_lost_update)
- read-scaling architecture, [Problems with Replication Lag](/en/ch6#sec_replication_lag), [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- versus sharding, [Pros and Cons of Sharding](/en/ch7#sec_sharding_reasons)
- reads as events, [Reads are events too](/en/ch13#sec_future_read_events)
- real-time
- analytics (see product analytics)
- collaborative editing, [Real-time collaboration, offline-first, and local-first apps](/en/ch6#real-time-collaboration-offline-first-and-local-first-apps)
- publish/subscribe dataflow, [End-to-end event streams](/en/ch13#id349)
- response time guarantees, [Response time guarantees](/en/ch9#sec_distributed_clocks_realtime)
- time-of-day clocks, [Time-of-day clocks](/en/ch9#time-of-day-clocks)
- Realm (database), [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- rebalancing shards, [Rebalancing key-range sharded data](/en/ch7#rebalancing-key-range-sharded-data)-[Operations: Automatic or Manual Rebalancing](/en/ch7#sec_sharding_operations), [Glossary](/en/glossary)
- (see also sharding)
- automatic or manual rebalancing, [Operations: Automatic or Manual Rebalancing](/en/ch7#sec_sharding_operations)
- fixed number of shards, [Fixed number of shards](/en/ch7#fixed-number-of-shards)
- fixed number of shards per node, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- problems with hash mod N, [Hash modulo number of nodes](/en/ch7#hash-modulo-number-of-nodes)
- recency guarantee, [Linearizability](/en/ch10#sec_consistency_linearizability)
- recommendation engines, [Operational Versus Analytical Systems](/en/ch1#sec_introduction_analytics)
- building using DataFrames, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- iterative processing, [Machine Learning](/en/ch11#id290)
- reconfiguration (consensus), [Subtleties of consensus](/en/ch10#subtleties-of-consensus)
- records, [MapReduce](/en/ch11#sec_batch_mapreduce)
- events in stream processing, [Transmitting Event Streams](/en/ch12#sec_stream_transmit)
- recursive queries
- in Cypher, [The Cypher Query Language](/en/ch3#id57)
- in Datalog, [Datalog: Recursive Relational Queries](/en/ch3#id62)
- in SPARQL, [The SPARQL query language](/en/ch3#the-sparql-query-language)
- lack of, in GraphQL, [GraphQL](/en/ch3#id63)
- SQL common table expressions, [Graph Queries in SQL](/en/ch3#id58)
- Red Hat
- Apicurio Registry, [JSON Schema](/en/ch5#json-schema)
- red-black tree, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- redelivery (messaging), [Acknowledgments and redelivery](/en/ch12#sec_stream_reordering)
- Redis (database)
- atomic operations, [Atomic write operations](/en/ch8#atomic-write-operations)
- CRDT support, [CRDTs and Operational Transformation](/en/ch6#sec_replication_crdts)
- durability, [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- Lua scripting, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs)
- multi-leader replication, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- process-per-core model, [Pros and Cons of Sharding](/en/ch7#sec_sharding_reasons)
- single-threaded execution, [Actual Serial Execution](/en/ch8#sec_transactions_serial)
- redo log (see write-ahead log)
- Redpanda (messaging), [Message brokers](/en/ch5#message-brokers), [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- tiered storage, [Disk space usage](/en/ch12#sec_stream_disk_usage)
- Redshift (database), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- redundancy
- hardware components, [Tolerating hardware faults through redundancy](/en/ch2#tolerating-hardware-faults-through-redundancy)
- of derived data, [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived)
- (see also derived data)
- Reed-Solomon codes (error correction), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- refactoring, [Evolvability: Making Change Easy](/en/ch2#sec_introduction_evolvability)
- (see also evolvability)
- regions (geographic distribution), [Reading Your Own Writes](/en/ch6#sec_replication_ryw)
- (see also datacenters)
- consensus across, [Pros and cons of consensus](/en/ch10#pros-and-cons-of-consensus)
- definition, [Reading Your Own Writes](/en/ch6#sec_replication_ryw)
- latency, [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed)
- linearizable ID generation, [Implementing a linearizable ID generator](/en/ch10#implementing-a-linearizable-id-generator)
- replication across, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)-[Problems with different topologies](/en/ch6#problems-with-different-topologies), [The Cost of Linearizability](/en/ch10#sec_linearizability_cost), [The limits of total ordering](/en/ch13#id335)
- leaderless, [Multi-region operation](/en/ch6#multi-region-operation)
- multi-leader, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- regions (sharding), [Sharding](/en/ch7#ch_sharding)
- register (data structure), [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- regulation (see legal matters)
- relational data model, [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake), [Relational Model versus Document Model](/en/ch3#sec_datamodels_history)-[Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- comparison to document model, [When to Use Which Model](/en/ch3#sec_datamodels_document_summary)-[Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- graph queries in SQL, [Graph Queries in SQL](/en/ch3#id58)
- in-memory databases with, [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- many-to-one and many-to-many relationships, [Many-to-One and Many-to-Many Relationships](/en/ch3#sec_datamodels_many_to_many)
- multi-object transactions, need for, [The need for multi-object transactions](/en/ch8#sec_transactions_need)
- object-relational mismatch, [The Object-Relational Mismatch](/en/ch3#sec_datamodels_document)
- representing a reorderable list, [When to Use Which Model](/en/ch3#sec_datamodels_document_summary)
- versus document model
- convergence of models, [Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- data locality, [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality)
- relational databases
- eventual consistency, [Problems with Replication Lag](/en/ch6#sec_replication_lag)
- history, [Relational Model versus Document Model](/en/ch3#sec_datamodels_history)
- leader-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- logical logs, [Logical (row-based) log replication](/en/ch6#logical-row-based-log-replication)
- philosophy compared to Unix, [Unbundling Databases](/en/ch13#sec_future_unbundling), [The meta-database of everything](/en/ch13#id341)
- schema changes, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility), [Encoding and Evolution](/en/ch5#ch_encoding), [Different values written at different times](/en/ch5#different-values-written-at-different-times)
- sharded secondary indexes, [Sharding and Secondary Indexes](/en/ch7#sec_sharding_secondary_indexes)
- statement-based replication, [Statement-based replication](/en/ch6#statement-based-replication)
- use of B-tree indexes, [B-Trees](/en/ch4#sec_storage_b_trees)
- relationships (see edges)
- reliability, [Reliability and Fault Tolerance](/en/ch2#sec_introduction_reliability)-[Humans and Reliability](/en/ch2#id31), [A Philosophy of Streaming Systems](/en/ch13#ch_philosophy)
- building a reliable system from unreliable components, [Faults and Partial Failures](/en/ch9#sec_distributed_partial_failure)
- hardware faults, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults)
- human errors, [Humans and Reliability](/en/ch2#id31)
- importance of, [Humans and Reliability](/en/ch2#id31)
- of messaging systems, [Messaging Systems](/en/ch12#sec_stream_messaging)
- software faults, [Software faults](/en/ch2#software-faults)
- Remote Method Invocation (Java RMI), [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc)
- remote procedure calls (RPCs), [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc)-[Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- (see also services)
- data encoding and evolution, [Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- issues with, [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc)
- using Avro, [But what is the writer's schema?](/en/ch5#but-what-is-the-writers-schema)
- versus message brokers, [Event-Driven Architectures](/en/ch5#sec_encoding_dataflow_msg)
- renewable energy, [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed)
- repeatable reads (transaction isolation), [Snapshot isolation, repeatable read, and naming confusion](/en/ch8#snapshot-isolation-repeatable-read-and-naming-confusion)
- replicas, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- replication, [Replication](/en/ch6#ch_replication)-[Summary](/en/ch6#summary), [Glossary](/en/glossary)
- and durability, [Durability](/en/ch8#durability)
- conflict resolution and, [Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- consistency properties, [Problems with Replication Lag](/en/ch6#sec_replication_lag)-[Solutions for Replication Lag](/en/ch6#id131)
- consistent prefix reads, [Consistent Prefix Reads](/en/ch6#sec_replication_consistent_prefix)
- monotonic reads, [Monotonic Reads](/en/ch6#sec_replication_monotonic_reads)
- reading your own writes, [Reading Your Own Writes](/en/ch6#sec_replication_ryw)
- in distributed filesystems, [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- leaderless, [Leaderless Replication](/en/ch6#sec_replication_leaderless)-[Version vectors](/en/ch6#version-vectors)
- detecting concurrent writes, [Detecting Concurrent Writes](/en/ch6#sec_replication_concurrent)-[Version vectors](/en/ch6#version-vectors)
- limitations of quorum consistency, [Limitations of Quorum Consistency](/en/ch6#sec_replication_quorum_limitations)-[Monitoring staleness](/en/ch6#monitoring-staleness), [Linearizability and quorums](/en/ch10#sec_consistency_quorum_linearizable)
- monitoring staleness, [Monitoring staleness](/en/ch6#monitoring-staleness)
- multi-leader, [Multi-Leader Replication](/en/ch6#sec_replication_multi_leader)-[Types of conflict](/en/ch6#sec_replication_write_conflicts)
- across multiple regions, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc), [The Cost of Linearizability](/en/ch10#sec_linearizability_cost)
- conflict resolution, [Dealing with Conflicting Writes](/en/ch6#sec_replication_write_conflicts)-[Types of conflict](/en/ch6#sec_replication_write_conflicts)
- replication topologies, [Multi-leader replication topologies](/en/ch6#sec_replication_topologies)-[Problems with different topologies](/en/ch6#problems-with-different-topologies)
- reasons for using, [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed), [Replication](/en/ch6#ch_replication)
- sharding and, [Sharding](/en/ch7#ch_sharding)
- single-leader, [Single-Leader Replication](/en/ch6#sec_replication_leader)-[Logical (row-based) log replication](/en/ch6#logical-row-based-log-replication)
- failover, [Leader failure: Failover](/en/ch6#leader-failure-failover)
- implementation of replication logs, [Implementation of Replication Logs](/en/ch6#sec_replication_implementation)-[Logical (row-based) log replication](/en/ch6#logical-row-based-log-replication)
- relation to consensus, [From single-leader replication to consensus](/en/ch10#from-single-leader-replication-to-consensus), [Pros and cons of consensus](/en/ch10#pros-and-cons-of-consensus)
- setting up new followers, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- synchronous versus asynchronous, [Synchronous Versus Asynchronous Replication](/en/ch6#sec_replication_sync_async)
- state machine replication, [Statement-based replication](/en/ch6#statement-based-replication), [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs), [Using shared logs](/en/ch10#sec_consistency_smr), [Databases and Streams](/en/ch12#sec_stream_databases)
- event sourcing, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- reliance on determinism, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- using consensus, [Pros and cons of consensus](/en/ch10#pros-and-cons-of-consensus)
- using erasure coding, [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- using object storage, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- versus backups, [Replication](/en/ch6#ch_replication)
- with heterogeneous data systems, [Keeping Systems in Sync](/en/ch12#sec_stream_sync)
- replication logs (see logs)
- representations of data (see data models)
- reprocessing data, [Reprocessing data for application evolution](/en/ch13#sec_future_reprocessing), [Unifying batch and stream processing](/en/ch13#id338)
- (see also evolvability)
- from log-based messaging, [Replaying old messages](/en/ch12#sec_stream_replay)
- request hedging, [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- request identifiers, [Uniquely identifying requests](/en/ch13#id355), [Multi-shard request processing](/en/ch13#id360)
- request routing, [Request Routing](/en/ch7#sec_sharding_routing)
- approaches to, [Request Routing](/en/ch7#sec_sharding_routing)
- residence laws for data, [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed), [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- resilient systems, [Reliability and Fault Tolerance](/en/ch2#sec_introduction_reliability)
- (see also fault tolerance)
- resource isolation, [Cloud Computing Versus Supercomputing](/en/ch1#id17), [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- resource limits, [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- response time
- as performance metric, [Describing Performance](/en/ch2#sec_introduction_percentiles), [Batch Processing](/en/ch11#ch_batch)
- guarantees on, [Response time guarantees](/en/ch9#sec_distributed_clocks_realtime)
- impact on users, [Average, Median, and Percentiles](/en/ch2#id24)
- in replicated systems, [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- latency versus, [Latency and Response Time](/en/ch2#id23)
- mean and percentiles, [Average, Median, and Percentiles](/en/ch2#id24)
- user experience, [Average, Median, and Percentiles](/en/ch2#id24)
- responsibility and accountability, [Responsibility and Accountability](/en/ch14#id371)
- REST (Representational State Transfer), [Web services](/en/ch5#sec_web_services)
- (see also services)
- Restate (workflow engine), [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows)
- RethinkDB (database)
- join support, [Convergence of document and relational databases](/en/ch3#convergence-of-document-and-relational-databases)
- key-range sharding, [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- retry storm, [Describing Performance](/en/ch2#sec_introduction_percentiles), [Software faults](/en/ch2#software-faults)
- reverse ETL, [Beyond the data lake](/en/ch1#beyond-the-data-lake)
- Riak (database)
- CRDT support, [CRDTs and Operational Transformation](/en/ch6#sec_replication_crdts), [Detecting Concurrent Writes](/en/ch6#sec_replication_concurrent)
- dotted version vectors, [Version vectors](/en/ch6#version-vectors)
- gossip protocol, [Request Routing](/en/ch7#sec_sharding_routing)
- hash sharding, [Fixed number of shards](/en/ch7#fixed-number-of-shards)
- leaderless replication, [Leaderless Replication](/en/ch6#sec_replication_leaderless)
- linearizability, lack of, [Linearizability and quorums](/en/ch10#sec_consistency_quorum_linearizable)
- multi-region support, [Multi-region operation](/en/ch6#multi-region-operation)
- rebalancing, [Operations: Automatic or Manual Rebalancing](/en/ch7#sec_sharding_operations)
- secondary indexes, [Local Secondary Indexes](/en/ch7#id166)
- sloppy quorums, [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- vnodes (sharding), [Sharding](/en/ch7#ch_sharding)
- ring buffers, [Disk space usage](/en/ch12#sec_stream_disk_usage)
- RisingWave (database)
- incremental view maintenance, [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- rockets, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)
- RocksDB (storage engine), [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- as embedded storage engine, [Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- leveled compaction, [Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- serving derived data, [Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- rollbacks (transactions), [Transactions](/en/ch8#ch_transactions)
- rolling upgrades, [Tolerating hardware faults through redundancy](/en/ch2#tolerating-hardware-faults-through-redundancy), [Encoding and Evolution](/en/ch5#ch_encoding), [Faults and Partial Failures](/en/ch9#sec_distributed_partial_failure)
- in a multitenant system, [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- routing (see request routing)
- row-based replication, [Logical (row-based) log replication](/en/ch6#logical-row-based-log-replication)
- row-oriented storage, [Column-Oriented Storage](/en/ch4#sec_storage_column)
- rowhammer (memory corruption), [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults)
- RPCs (see remote procedure calls)
- rules (Datalog), [Datalog: Recursive Relational Queries](/en/ch3#id62)
- Rust (programming language)
- memory management, [Limiting the impact of garbage collection](/en/ch9#sec_distributed_gc_impact)
### S
- S3 (object storage), [Layering of cloud services](/en/ch1#layering-of-cloud-services), [Setting Up New Followers](/en/ch6#sec_replication_new_replica), [Batch Processing](/en/ch11#ch_batch), [Distributed Filesystems](/en/ch11#sec_batch_dfs), [Object Stores](/en/ch11#id277)
- checking data integrity, [Don't just blindly trust what they promise](/en/ch13#id364)
- conditional writes, [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens)
- object size, [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute)
- S3 Express One Zone, [Object Stores](/en/ch11#id277)
- use in MapReduce, [MapReduce](/en/ch11#sec_batch_mapreduce)
- workflow example, [Scheduling Workflows](/en/ch11#sec_batch_workflows)
- SaaS (see software as a service (SaaS))
- safety and liveness properties, [Safety and liveness](/en/ch9#sec_distributed_safety_liveness)
- in consensus algorithms, [Single-value consensus](/en/ch10#single-value-consensus)
- in transactions, [Transactions](/en/ch8#ch_transactions)
- sagas (see compensating transactions)
- Samza (stream processor), [Stream analytics](/en/ch12#id318)
- SAP HANA (database), [Data Storage for Analytics](/en/ch4#sec_storage_analytics)
- scalability, [Scalability](/en/ch2#sec_introduction_scalability)-[Principles for Scalability](/en/ch2#id35), [A Philosophy of Streaming Systems](/en/ch13#ch_philosophy)
- auto-scaling, [Operations: Automatic or Manual Rebalancing](/en/ch7#sec_sharding_operations)
- by sharding, [Pros and Cons of Sharding](/en/ch7#sec_sharding_reasons)
- describing load, [Describing Load](/en/ch2#id33)
- describing performance, [Describing Performance](/en/ch2#sec_introduction_percentiles)
- linear, [Describing Load](/en/ch2#id33)
- principles for, [Principles for Scalability](/en/ch2#id35)
- replication and, [Problems with Replication Lag](/en/ch6#sec_replication_lag)
- scaling up versus scaling out, [Shared-Memory, Shared-Disk, and Shared-Nothing Architecture](/en/ch2#sec_introduction_shared_nothing)
- scaling out, [Shared-Memory, Shared-Disk, and Shared-Nothing Architecture](/en/ch2#sec_introduction_shared_nothing)
- (see also shared-nothing architecture)
- by sharding, [Pros and Cons of Sharding](/en/ch7#sec_sharding_reasons)
- scaling up, [Shared-Memory, Shared-Disk, and Shared-Nothing Architecture](/en/ch2#sec_introduction_shared_nothing)
- SCD (slowly changing dimension), [Time-dependence of joins](/en/ch12#sec_stream_join_time)
- scheduling
- algorithms, [Resource Allocation](/en/ch11#id279)
- batch jobs, [Distributed Job Orchestration](/en/ch11#id278)-[Scheduling Workflows](/en/ch11#sec_batch_workflows)
- gang scheduling, [Resource Allocation](/en/ch11#id279)
- schema-on-read, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility)
- comparison to evolvable schema, [The Merits of Schemas](/en/ch5#sec_encoding_schemas)
- schema-on-write, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility)
- schemaless databases (see schema-on-read)
- schemas, [Glossary](/en/glossary)
- Avro, [Avro](/en/ch5#sec_encoding_avro)-[Dynamically generated schemas](/en/ch5#dynamically-generated-schemas)
- reader determining writer's schema, [But what is the writer's schema?](/en/ch5#but-what-is-the-writers-schema)
- schema evolution, [The writer's schema and the reader's schema](/en/ch5#the-writers-schema-and-the-readers-schema)
- dynamically generated, [Dynamically generated schemas](/en/ch5#dynamically-generated-schemas)
- evolution of, [Reprocessing data for application evolution](/en/ch13#sec_future_reprocessing)
- affecting application code, [Encoding and Evolution](/en/ch5#ch_encoding)
- compatibility checking, [But what is the writer's schema?](/en/ch5#but-what-is-the-writers-schema)
- in databases, [Dataflow Through Databases](/en/ch5#sec_encoding_dataflow_db)-[Archival storage](/en/ch5#archival-storage)
- in service calls, [Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- flexibility in document model, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility)
- for analytics, [Stars and Snowflakes: Schemas for Analytics](/en/ch3#sec_datamodels_analytics)
- for JSON and XML, [JSON, XML, and Binary Variants](/en/ch5#sec_encoding_json), [JSON Schema](/en/ch5#json-schema)
- generation and migration using ORMs, [Object-relational mapping (ORM)](/en/ch3#object-relational-mapping-orm)
- merits of, [The Merits of Schemas](/en/ch5#sec_encoding_schemas)
- migration, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility)
- Protocol Buffers, [Protocol Buffers](/en/ch5#sec_encoding_protobuf)-[Field tags and schema evolution](/en/ch5#field-tags-and-schema-evolution)
- schema evolution, [Field tags and schema evolution](/en/ch5#field-tags-and-schema-evolution)
- schema migration on railways, [Reprocessing data for application evolution](/en/ch13#sec_future_reprocessing)
- traditional approach to design, fallacy in, [Deriving several views from the same event log](/en/ch12#sec_stream_deriving_views)
- scientific computing, [Cloud Computing Versus Supercomputing](/en/ch1#id17)
- scikit-learn (Python library), [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake)
- ScyllaDB (database)
- cluster metadata, [Request Routing](/en/ch7#sec_sharding_routing)
- consistency level ANY, [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- hash-range sharding, [Sharding by Hash of Key](/en/ch7#sec_sharding_hash), [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- last-write-wins conflict resolution, [Detecting Concurrent Writes](/en/ch6#sec_replication_concurrent)
- leaderless replication, [Leaderless Replication](/en/ch6#sec_replication_leaderless)
- lightweight transactions, [Single-object writes](/en/ch8#sec_transactions_single_object)
- linearizability, lack of, [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)
- log-structured storage, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- multi-region support, [Multi-region operation](/en/ch6#multi-region-operation)
- use of clocks, [Limitations of Quorum Consistency](/en/ch6#sec_replication_quorum_limitations), [Timestamps for ordering events](/en/ch9#sec_distributed_lww)
- vnodes (sharding), [Sharding](/en/ch7#ch_sharding)
- search engines (see full-text search)
- searching on streams, [Search on streams](/en/ch12#id320)
- secondaries (see leader-based replication)
- secondary indexes, [Multi-Column and Secondary Indexes](/en/ch4#sec_storage_index_multicolumn), [Glossary](/en/glossary)
- for many-to-many relationships, [Many-to-One and Many-to-Many Relationships](/en/ch3#sec_datamodels_many_to_many)
- problems with dual writes, [Keeping Systems in Sync](/en/ch12#sec_stream_sync), [Reasoning about dataflows](/en/ch13#id443)
- sharding, [Sharding and Secondary Indexes](/en/ch7#sec_sharding_secondary_indexes)-[Global Secondary Indexes](/en/ch7#id167), [Summary](/en/ch7#summary)
- global, [Global Secondary Indexes](/en/ch7#id167)
- index maintenance, [Maintaining derived state](/en/ch13#id446)
- local, [Local Secondary Indexes](/en/ch7#id166)
- updating, transaction isolation and, [The need for multi-object transactions](/en/ch8#sec_transactions_need)
- secondary sort (MapReduce), [JOIN and GROUP BY](/en/ch11#sec_batch_join)
- sed (Unix tool), [Simple Log Analysis](/en/ch11#sec_batch_log_analysis)
- self-hosting, [Cloud Versus Self-Hosting](/en/ch1#sec_introduction_cloud)
- data warehouses, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- self-joins, [Summary](/en/ch12#id332)
- self-validating systems, [Don't just blindly trust what they promise](/en/ch13#id364)
- semantic search, [Vector Embeddings](/en/ch4#id92)
- semantic similarity, [Vector Embeddings](/en/ch4#id92)
- semantic web, [Triple-Stores and SPARQL](/en/ch3#id59)
- semi-synchronous replication, [Synchronous Versus Asynchronous Replication](/en/ch6#sec_replication_sync_async)
- sequential writes (access pattern), [Sequential versus random writes](/en/ch4#sidebar_sequential)
- serializability, [Isolation](/en/ch8#sec_transactions_acid_isolation), [Weak Isolation Levels](/en/ch8#sec_transactions_isolation_levels), [Serializability](/en/ch8#sec_transactions_serializability)-[Performance of serializable snapshot isolation](/en/ch8#performance-of-serializable-snapshot-isolation), [Glossary](/en/glossary)
- linearizability versus, [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- pessimistic versus optimistic concurrency control, [Pessimistic versus optimistic concurrency control](/en/ch8#pessimistic-versus-optimistic-concurrency-control)
- serial execution, [Actual Serial Execution](/en/ch8#sec_transactions_serial)-[Summary of serial execution](/en/ch8#summary-of-serial-execution)
- sharding, [Sharding](/en/ch8#sharding)
- using stored procedures, [Encapsulating transactions in stored procedures](/en/ch8#encapsulating-transactions-in-stored-procedures), [Using shared logs](/en/ch10#sec_consistency_smr)
- serializable snapshot isolation (SSI), [Serializable Snapshot Isolation (SSI)](/en/ch8#sec_transactions_ssi)-[Performance of serializable snapshot isolation](/en/ch8#performance-of-serializable-snapshot-isolation)
- detecting stale MVCC reads, [Detecting stale MVCC reads](/en/ch8#detecting-stale-mvcc-reads)
- detecting writes that affect prior reads, [Detecting writes that affect prior reads](/en/ch8#sec_detecting_writes_affect_reads)
- distributed execution, [Performance of serializable snapshot isolation](/en/ch8#performance-of-serializable-snapshot-isolation), [Database-internal Distributed Transactions](/en/ch8#sec_transactions_internal)
- performance of SSI, [Performance of serializable snapshot isolation](/en/ch8#performance-of-serializable-snapshot-isolation)
- preventing write skew, [Decisions based on an outdated premise](/en/ch8#decisions-based-on-an-outdated-premise)-[Detecting writes that affect prior reads](/en/ch8#sec_detecting_writes_affect_reads)
- strict serializability, [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- timeliness vs. integrity, [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- two-phase locking (2PL), [Two-Phase Locking (2PL)](/en/ch8#sec_transactions_2pl)-[Index-range locks](/en/ch8#sec_transactions_2pl_range)
- index-range locks, [Index-range locks](/en/ch8#sec_transactions_2pl_range)
- performance, [Performance of two-phase locking](/en/ch8#performance-of-two-phase-locking)
- Serializable (Java), [Language-Specific Formats](/en/ch5#id96)
- serialization, [Formats for Encoding Data](/en/ch5#sec_encoding_formats)
- (see also encoding)
- serverless, [Microservices and Serverless](/en/ch1#sec_introduction_microservices)
- service discovery, [Load balancers, service discovery, and service meshes](/en/ch5#sec_encoding_service_discovery), [Request Routing](/en/ch7#sec_sharding_routing), [Service discovery](/en/ch10#service-discovery)
- registration, [Load balancers, service discovery, and service meshes](/en/ch5#sec_encoding_service_discovery)
- using DNS, [Load balancers, service discovery, and service meshes](/en/ch5#sec_encoding_service_discovery), [Request Routing](/en/ch7#sec_sharding_routing), [Service discovery](/en/ch10#service-discovery)
- service level agreements (SLAs), [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla), [Describing Load](/en/ch2#id33)
- service mesh, [Load balancers, service discovery, and service meshes](/en/ch5#sec_encoding_service_discovery)
- Service Organization Control (SOC), [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance)
- service time, [Latency and Response Time](/en/ch2#id23)
- service-oriented architecture (SOA), [Microservices and Serverless](/en/ch1#sec_introduction_microservices)
- (see also services)
- services, [Dataflow Through Services: REST and RPC](/en/ch5#sec_encoding_dataflow_rpc)-[Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- microservices, [Microservices and Serverless](/en/ch1#sec_introduction_microservices)
- causal dependencies across services, [The limits of total ordering](/en/ch13#id335)
- loose coupling, [Making unbundling work](/en/ch13#sec_future_unbundling_favor)
- relation to batch/stream processors, [Batch Processing](/en/ch11#ch_batch), [Stream processors and services](/en/ch13#id345)
- remote procedure calls (RPCs), [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc)-[Data encoding and evolution for RPC](/en/ch5#data-encoding-and-evolution-for-rpc)
- issues with, [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc)
- similarity to databases, [Dataflow Through Services: REST and RPC](/en/ch5#sec_encoding_dataflow_rpc)
- web services, [Web services](/en/ch5#sec_web_services)
- session windows (stream processing), [Types of windows](/en/ch12#id324)
- (see also windows)
- sharding, [Sharding](/en/ch7#ch_sharding)-[Summary](/en/ch7#summary), [Glossary](/en/glossary)
- and consensus, [Using shared logs](/en/ch10#sec_consistency_smr)
- and replication, [Sharding](/en/ch7#ch_sharding)
- distributed transactions across shards, [Distributed Transactions](/en/ch8#sec_transactions_distributed)
- hot shards, [Sharding of Key-Value Data](/en/ch7#sec_sharding_key_value)
- in batch processing, [Batch Processing](/en/ch11#ch_batch)
- key-range splitting, [Rebalancing key-range sharded data](/en/ch7#rebalancing-key-range-sharded-data)
- multi-shard operations, [Multi-shard data processing](/en/ch13#sec_future_unbundled_multi_shard)
- enforcing constraints, [Multi-shard request processing](/en/ch13#id360)
- secondary index maintenance, [Maintaining derived state](/en/ch13#id446)
- of key-value data, [Sharding of Key-Value Data](/en/ch7#sec_sharding_key_value)-[Skewed Workloads and Relieving Hot Spots](/en/ch7#sec_sharding_skew)
- by key range, [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- skew and hot spots, [Skewed Workloads and Relieving Hot Spots](/en/ch7#sec_sharding_skew)
- origin of the term, [Sharding](/en/ch7#ch_sharding)
- partition key, [Pros and Cons of Sharding](/en/ch7#sec_sharding_reasons), [Sharding of Key-Value Data](/en/ch7#sec_sharding_key_value)
- rebalancing
- of key-range sharded data, [Rebalancing key-range sharded data](/en/ch7#rebalancing-key-range-sharded-data)
- rebalancing shards, [Rebalancing key-range sharded data](/en/ch7#rebalancing-key-range-sharded-data)-[Operations: Automatic or Manual Rebalancing](/en/ch7#sec_sharding_operations)
- automatic or manual rebalancing, [Operations: Automatic or Manual Rebalancing](/en/ch7#sec_sharding_operations)
- problems with hash mod N, [Hash modulo number of nodes](/en/ch7#hash-modulo-number-of-nodes)
- using fixed number of shards, [Fixed number of shards](/en/ch7#fixed-number-of-shards)
- using N shards per node, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- request routing, [Request Routing](/en/ch7#sec_sharding_routing)
- secondary indexes, [Sharding and Secondary Indexes](/en/ch7#sec_sharding_secondary_indexes)-[Global Secondary Indexes](/en/ch7#id167)
- global, [Global Secondary Indexes](/en/ch7#id167)
- local, [Local Secondary Indexes](/en/ch7#id166)
- serial execution of transactions and, [Sharding](/en/ch8#sharding)
- sorting sharded data, [Shuffling Data](/en/ch11#sec_shuffle)
- shared logs, [Consensus in Practice](/en/ch10#sec_consistency_total_order)-[Pros and cons of consensus](/en/ch10#pros-and-cons-of-consensus), [The limits of total ordering](/en/ch13#id335), [Uniqueness in log-based messaging](/en/ch13#sec_future_uniqueness_log)
- algorithms, [Consensus in Practice](/en/ch10#sec_consistency_total_order)
- for event sourcing, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- for messaging, [Log-based Message Brokers](/en/ch12#sec_stream_log)-[Replaying old messages](/en/ch12#sec_stream_replay)
- relation to consensus, [Shared logs as consensus](/en/ch10#sec_consistency_shared_logs)
- using, [Using shared logs](/en/ch10#sec_consistency_smr)
- shared mode (locks), [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- shared-disk architecture, [Shared-Memory, Shared-Disk, and Shared-Nothing Architecture](/en/ch2#sec_introduction_shared_nothing), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- shared-memory architecture, [Shared-Memory, Shared-Disk, and Shared-Nothing Architecture](/en/ch2#sec_introduction_shared_nothing)
- shared-nothing architecture, [Shared-Memory, Shared-Disk, and Shared-Nothing Architecture](/en/ch2#sec_introduction_shared_nothing), [Glossary](/en/glossary)
- distributed filesystems, [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- (see also distributed filesystems)
- use of network, [Unreliable Networks](/en/ch9#sec_distributed_networks)
- sharks
- biting undersea cables, [Network Faults in Practice](/en/ch9#sec_distributed_network_faults)
- counting (example), [Query languages for documents](/en/ch3#query-languages-for-documents)
- shredding (deletion) (see crypto-shredding)
- shredding (in columnar encoding), [Column-Oriented Storage](/en/ch4#sec_storage_column)
- shredding (in relational model), [When to Use Which Model](/en/ch3#sec_datamodels_document_summary)
- shuffle (batch processing), [Shuffling Data](/en/ch11#sec_shuffle)
- siblings (concurrent values), [Manual conflict resolution](/en/ch6#manual-conflict-resolution), [Capturing the happens-before relationship](/en/ch6#capturing-the-happens-before-relationship), [Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- (see also conflicts)
- silo, [Data Warehousing](/en/ch1#sec_introduction_dwh)
- similarity search
- edit distance, [Full-Text Search](/en/ch4#sec_storage_full_text)
- genome data, [Summary](/en/ch3#summary)
- simplicity, [Simplicity: Managing Complexity](/en/ch2#id38)
- Singer, [Data Warehousing](/en/ch1#sec_introduction_dwh)
- single-instruction-multi-data (SIMD) instructions, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- single-leader replication (see leader-based replication)
- single-threaded execution, [Atomic write operations](/en/ch8#atomic-write-operations), [Actual Serial Execution](/en/ch8#sec_transactions_serial)
- in stream processing, [Logs compared to traditional messaging](/en/ch12#sec_stream_logs_vs_messaging), [Concurrency control](/en/ch12#sec_stream_concurrency), [Uniqueness in log-based messaging](/en/ch13#sec_future_uniqueness_log)
- SingleStore (database)
- in-memory storage, [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- site reliability engineer, [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- size-tiered compaction, [Compaction strategies](/en/ch4#sec_storage_lsm_compaction), [Disk space usage](/en/ch4#disk-space-usage)
- skew, [Glossary](/en/glossary)
- clock skew, [Relying on Synchronized Clocks](/en/ch9#sec_distributed_clocks_relying)-[Clock readings with a confidence interval](/en/ch9#clock-readings-with-a-confidence-interval), [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)
- in transaction isolation
- read skew, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation), [Summary](/en/ch8#summary)
- write skew, [Write Skew and Phantoms](/en/ch8#sec_transactions_write_skew)-[Materializing conflicts](/en/ch8#materializing-conflicts), [Decisions based on an outdated premise](/en/ch8#decisions-based-on-an-outdated-premise)-[Detecting writes that affect prior reads](/en/ch8#sec_detecting_writes_affect_reads)
- (see also write skew)
- meanings of, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)
- unbalanced workload, [Sharding of Key-Value Data](/en/ch7#sec_sharding_key_value)
- compensating for, [Skewed Workloads and Relieving Hot Spots](/en/ch7#sec_sharding_skew)
- due to celebrities, [Skewed Workloads and Relieving Hot Spots](/en/ch7#sec_sharding_skew)
- for time-series data, [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- skip list, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- SLA (see service level agreements)
- Slack (group chat)
- GraphQL example, [GraphQL](/en/ch3#id63)
- SlateDB (database), [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables), [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- sliding windows (stream processing), [Types of windows](/en/ch12#id324)
- (see also windows)
- sloppy quorums, [Single-Leader Versus Leaderless Replication Performance](/en/ch6#sec_replication_leaderless_perf)
- slowly changing dimension (data warehouses), [Time-dependence of joins](/en/ch12#sec_stream_join_time)
- smearing (leap seconds adjustments), [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy)
- snapshots (databases)
- as backups, [Replication](/en/ch6#ch_replication)
- computing derived data, [Creating an index](/en/ch13#id340)
- in change data capture, [Initial snapshot](/en/ch12#sec_stream_cdc_snapshot)
- serializable snapshot isolation (SSI), [Serializable Snapshot Isolation (SSI)](/en/ch8#sec_transactions_ssi)-[Performance of serializable snapshot isolation](/en/ch8#performance-of-serializable-snapshot-isolation)
- setting up a new replica, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- snapshot isolation and repeatable read, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)-[Snapshot isolation, repeatable read, and naming confusion](/en/ch8#snapshot-isolation-repeatable-read-and-naming-confusion)
- implementing with MVCC, [Multi-version concurrency control (MVCC)](/en/ch8#sec_transactions_snapshot_impl)
- indexes and MVCC, [Indexes and snapshot isolation](/en/ch8#indexes-and-snapshot-isolation)
- visibility rules, [Visibility rules for observing a consistent snapshot](/en/ch8#sec_transactions_mvcc_visibility)
- synchronized clocks for global snapshots, [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- Snowflake (database), [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native), [Layering of cloud services](/en/ch1#layering-of-cloud-services), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses), [Batch Processing](/en/ch11#ch_batch)
- column-oriented storage, [Column-Oriented Storage](/en/ch4#sec_storage_column)
- handling writes, [Writing to Column-Oriented Storage](/en/ch4#writing-to-column-oriented-storage)
- sharding and clustering, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- Snowpark, [Query languages](/en/ch11#sec_batch_query_lanauges)
- Snowflake (ID generator), [ID Generators and Logical Clocks](/en/ch10#sec_consistency_logical)
- snowflake schemas, [Stars and Snowflakes: Schemas for Analytics](/en/ch3#sec_datamodels_analytics)
- SOAP (web services), [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc)
- SOC2 (see Service Organization Control (SOC))
- social graph, [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- society
- responsibility towards, [Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance), [Legislation and Self-Regulation](/en/ch14#sec_future_legislation)
- sociotechnical systems, [Humans and Reliability](/en/ch2#id31)
- software as a service (SaaS), [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs), [Cloud Versus Self-Hosting](/en/ch1#sec_introduction_cloud)
- ETL from, [Data Warehousing](/en/ch1#sec_introduction_dwh)
- multitenancy, [Sharding for Multitenancy](/en/ch7#sec_sharding_multitenancy)
- software bugs, [Software faults](/en/ch2#software-faults)
- maintaining integrity, [Maintaining integrity in the face of software bugs](/en/ch13#id455)
- solar storm, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults)
- solid state drives (SSDs)
- access patterns, [Sequential versus random writes](/en/ch4#sidebar_sequential)
- compared to object storage, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- detecting corruption, [The end-to-end argument](/en/ch13#sec_future_e2e_argument), [Don't just blindly trust what they promise](/en/ch13#id364)
- failure rate, [Hardware and Software Faults](/en/ch2#sec_introduction_hardware_faults)
- faults in, [Durability](/en/ch8#durability)
- firmware bugs, [Software faults](/en/ch2#software-faults)
- read throughput, [Read performance](/en/ch4#read-performance)
- sequential vs. random writes, [Sequential versus random writes](/en/ch4#sidebar_sequential)
- Solr (search server)
- local secondary indexes, [Local Secondary Indexes](/en/ch7#id166)
- request routing, [Request Routing](/en/ch7#sec_sharding_routing)
- use of Lucene, [Full-Text Search](/en/ch4#sec_storage_full_text)
- sort (Unix tool), [Simple Log Analysis](/en/ch11#sec_batch_log_analysis), [Sorting Versus In-memory Aggregation](/en/ch11#id275), [Distributed Job Orchestration](/en/ch11#id278)
- sort-merge joins (MapReduce), [JOIN and GROUP BY](/en/ch11#sec_batch_join)
- Sorted String Tables (see SSTables)
- sorting
- sort order in column storage, [Sort Order in Column Storage](/en/ch4#sort-order-in-column-storage)
- source of truth (see systems of record)
- Spanner (database)
- consistency model, [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- data locality, [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality)
- in the cloud, [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native)
- snapshot isolation using clocks, [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- transactions, [What Exactly Is a Transaction?](/en/ch8#sec_transactions_overview), [Database-internal Distributed Transactions](/en/ch8#sec_transactions_internal)
- TrueTime API, [Clock readings with a confidence interval](/en/ch9#clock-readings-with-a-confidence-interval)
- Spark (processing framework), [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake), [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native), [Batch Processing](/en/ch11#ch_batch), [Dataflow Engines](/en/ch11#sec_batch_dataflow)
- cost efficiency, [Query languages](/en/ch11#sec_batch_query_lanauges)
- DataFrames, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes), [DataFrames](/en/ch11#id287)
- fault tolerance, [Handling Faults](/en/ch11#id281)
- for data warehouses, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- high availability using ZooKeeper, [Coordination Services](/en/ch10#sec_consistency_coordination)
- MLlib, [Machine Learning](/en/ch11#id290)
- query optimizer, [Query languages](/en/ch11#sec_batch_query_lanauges)
- shuffling data, [Shuffling Data](/en/ch11#sec_shuffle)
- Spark Streaming, [Stream analytics](/en/ch12#id318)
- microbatching, [Microbatching and checkpointing](/en/ch12#id329)
- streaming SQL support, [Complex event processing](/en/ch12#id317)
- use for ETL, [Extract–Transform–Load (ETL)](/en/ch11#sec_batch_etl_usage)
- SPARQL (query language), [The SPARQL query language](/en/ch3#the-sparql-query-language)
- sparse index, [The SSTable file format](/en/ch4#the-sstable-file-format)
- sparse matrices, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- split brain, [Leader failure: Failover](/en/ch6#leader-failure-failover), [Request Routing](/en/ch7#sec_sharding_routing), [Glossary](/en/glossary)
- enforcing constraints, [Uniqueness constraints require consensus](/en/ch13#id452)
- in consensus algorithms, [Consensus](/en/ch10#sec_consistency_consensus), [From single-leader replication to consensus](/en/ch10#from-single-leader-replication-to-consensus)
- preventing, [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)
- using fencing tokens to avoid, [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens)-[Fencing with multiple replicas](/en/ch9#fencing-with-multiple-replicas)
- spot instances, [Handling Faults](/en/ch11#id281)
- spreadsheets, [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs), [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- dataflow programming, [Designing Applications Around Dataflow](/en/ch13#sec_future_dataflow)
- pivot table, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- SQL (Structured Query Language), [Simplicity: Managing Complexity](/en/ch2#id38), [Relational Model versus Document Model](/en/ch3#sec_datamodels_history), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- for analytics, [Data Warehousing](/en/ch1#sec_introduction_dwh), [Column-Oriented Storage](/en/ch4#sec_storage_column)
- graph queries in, [Graph Queries in SQL](/en/ch3#id58)
- isolation levels standard, issues with, [Snapshot isolation, repeatable read, and naming confusion](/en/ch8#snapshot-isolation-repeatable-read-and-naming-confusion)
- joins, [Normalization, Denormalization, and Joins](/en/ch3#sec_datamodels_normalization)
- résumé (example), [The document data model for one-to-many relationships](/en/ch3#the-document-data-model-for-one-to-many-relationships)
- social network home timelines (example), [Representing Users, Posts, and Follows](/en/ch2#id20)
- SQL injection vulnerability, [Byzantine Faults](/en/ch9#sec_distributed_byzantine)
- statement-based replication, [Statement-based replication](/en/ch6#statement-based-replication)
- stored procedures, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs)
- support in batch processing frameworks, [Batch Processing](/en/ch11#ch_batch)
- views, [Datalog: Recursive Relational Queries](/en/ch3#id62)
- SQL Server (database)
- archiving WAL to object stores, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- change data capture, [Implementing change data capture](/en/ch12#id307)
- data warehousing support, [Data Storage for Analytics](/en/ch4#sec_storage_analytics)
- distributed transaction support, [XA transactions](/en/ch8#xa-transactions)
- leader-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- multi-leader replication, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- preventing lost updates, [Automatically detecting lost updates](/en/ch8#automatically-detecting-lost-updates)
- preventing write skew, [Characterizing write skew](/en/ch8#characterizing-write-skew), [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- read committed isolation, [Implementing read committed](/en/ch8#sec_transactions_read_committed_impl)
- serializable isolation, [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- snapshot isolation support, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)
- T-SQL language, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs)
- SQLite (database), [Problems with Distributed Systems](/en/ch1#sec_introduction_dist_sys_problems), [Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- archiving WAL to object stores, [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- SRE (site reliability engineer), [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- SSDs (see solid state drives)
- SSTables (storage format), [The SSTable file format](/en/ch4#the-sstable-file-format)-[Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- constructing and maintaining, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- making LSM-Tree from, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- staged rollout (see rolling upgrades)
- staleness (old data), [Reading Your Own Writes](/en/ch6#sec_replication_ryw)
- cross-channel timing dependencies, [Cross-channel timing dependencies](/en/ch10#cross-channel-timing-dependencies)
- in leaderless databases, [Writing to the Database When a Node Is Down](/en/ch6#id287)
- in multi-version concurrency control, [Detecting stale MVCC reads](/en/ch8#detecting-stale-mvcc-reads)
- monitoring for, [Monitoring staleness](/en/ch6#monitoring-staleness)
- of client state, [Pushing state changes to clients](/en/ch13#id348)
- versus linearizability, [Linearizability](/en/ch10#sec_consistency_linearizability)
- versus timeliness, [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- standbys (see leader-based replication)
- star replication topologies, [Multi-leader replication topologies](/en/ch6#sec_replication_topologies)
- star schemas, [Stars and Snowflakes: Schemas for Analytics](/en/ch3#sec_datamodels_analytics)
- Star Wars analogy (event time versus processing time), [Event time versus processing time](/en/ch12#id322)
- starvation (scheduling), [Resource Allocation](/en/ch11#id279)
- state
- derived from log of immutable events, [State, Streams, and Immutability](/en/ch12#sec_stream_immutability)
- interplay between state changes and application code, [Dataflow: Interplay between state changes and application code](/en/ch13#id450)
- maintaining derived state, [Maintaining derived state](/en/ch13#id446)
- maintenance by stream processor in stream-stream joins, [Stream-stream join (window join)](/en/ch12#id440)
- observing derived state, [Observing Derived State](/en/ch13#sec_future_observing)-[Multi-shard data processing](/en/ch13#sec_future_unbundled_multi_shard)
- rebuilding after stream processor failure, [Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- separation of application code and, [Separation of application code and state](/en/ch13#id344)
- state machine replication, [Statement-based replication](/en/ch6#statement-based-replication), [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs), [Using shared logs](/en/ch10#sec_consistency_smr), [Databases and Streams](/en/ch12#sec_stream_databases)
- event sourcing, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- reliance on determinism, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- stateless systems, [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs)
- statement-based replication, [Statement-based replication](/en/ch6#statement-based-replication)
- reliance on determinism, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- statically typed languages
- analogy to schema-on-write, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility)
- statistical and numerical algorithms, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- StatsD (metrics aggregator), [Direct messaging from producers to consumers](/en/ch12#id296)
- stock market feeds, [Direct messaging from producers to consumers](/en/ch12#id296)
- STONITH (Shoot The Other Node In The Head), [Leader failure: Failover](/en/ch6#leader-failure-failover)
- problems with, [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens)
- stop-the-world (see garbage collection)
- storage
- composing data storage technologies, [Composing Data Storage Technologies](/en/ch13#id447)-[Unbundled versus integrated systems](/en/ch13#id448)
- Storage Area Network (SAN), [Shared-Memory, Shared-Disk, and Shared-Nothing Architecture](/en/ch2#sec_introduction_shared_nothing), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- storage engines, [Storage and Retrieval](/en/ch4#ch_storage)-[Summary](/en/ch4#summary)
- column-oriented, [Column-Oriented Storage](/en/ch4#sec_storage_column)-[Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- column compression, [Column Compression](/en/ch4#sec_storage_column_compression)
- defined, [Column-Oriented Storage](/en/ch4#sec_storage_column)
- Parquet, [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses), [Column-Oriented Storage](/en/ch4#sec_storage_column), [Archival storage](/en/ch5#archival-storage)
- sort order in, [Sort Order in Column Storage](/en/ch4#sort-order-in-column-storage)
- versus wide-column model, [Column Compression](/en/ch4#sec_storage_column_compression)
- writing to, [Writing to Column-Oriented Storage](/en/ch4#writing-to-column-oriented-storage)
- in-memory storage, [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- durability, [Durability](/en/ch8#durability)
- row-oriented, [Storage and Indexing for OLTP](/en/ch4#sec_storage_oltp)-[Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- B-trees, [B-Trees](/en/ch4#sec_storage_b_trees)-[B-tree variants](/en/ch4#b-tree-variants)
- comparing B-trees and LSM-trees, [Comparing B-Trees and LSM-Trees](/en/ch4#sec_storage_btree_lsm_comparison)-[Disk space usage](/en/ch4#disk-space-usage)
- defined, [Column-Oriented Storage](/en/ch4#sec_storage_column)
- log-structured, [Log-Structured Storage](/en/ch4#sec_storage_log_structured)-[Compaction strategies](/en/ch4#sec_storage_lsm_compaction)
- stored procedures, [Encapsulating transactions in stored procedures](/en/ch8#encapsulating-transactions-in-stored-procedures)-[Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs), [Glossary](/en/glossary)
- and shared logs, [Using shared logs](/en/ch10#sec_consistency_smr)
- pros and cons of, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs)
- similarity to stream processors, [Application code as a derivation function](/en/ch13#sec_future_dataflow_derivation)
- Storm (stream processor), [Stream analytics](/en/ch12#id318)
- distributed RPC, [Event-Driven Architectures and RPC](/en/ch12#sec_stream_actors_drpc), [Multi-shard data processing](/en/ch13#sec_future_unbundled_multi_shard)
- Trident state handling, [Idempotence](/en/ch12#sec_stream_idempotence)
- straggler events, [Handling straggler events](/en/ch12#id323)
- Stream Control Transmission Protocol (SCTP), [The Limitations of TCP](/en/ch9#sec_distributed_tcp)
- stream processing, [Processing Streams](/en/ch12#sec_stream_processing)-[Summary](/en/ch12#id332), [Glossary](/en/glossary)
- accessing external services within job, [Stream-table join (stream enrichment)](/en/ch12#sec_stream_table_joins), [Microbatching and checkpointing](/en/ch12#id329), [Idempotence](/en/ch12#sec_stream_idempotence), [Exactly-once execution of an operation](/en/ch13#id353)
- combining with batch processing, [Unifying batch and stream processing](/en/ch13#id338)
- comparison to batch processing, [Processing Streams](/en/ch12#sec_stream_processing)
- complex event processing (CEP), [Complex event processing](/en/ch12#id317)
- fault tolerance, [Fault Tolerance](/en/ch12#sec_stream_fault_tolerance)-[Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- atomic commit, [Atomic commit revisited](/en/ch12#sec_stream_atomic_commit)
- idempotence, [Idempotence](/en/ch12#sec_stream_idempotence)
- microbatching and checkpointing, [Microbatching and checkpointing](/en/ch12#id329)
- rebuilding state after a failure, [Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- for data integration, [Batch and Stream Processing](/en/ch13#sec_future_batch_streaming)-[Unifying batch and stream processing](/en/ch13#id338)
- for event sourcing, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- maintaining derived state, [Maintaining derived state](/en/ch13#id446)
- maintenance of materialized views, [Maintaining materialized views](/en/ch12#sec_stream_mat_view)
- messaging systems (see messaging systems)
- reasoning about time, [Reasoning About Time](/en/ch12#sec_stream_time)-[Types of windows](/en/ch12#id324)
- event time versus processing time, [Event time versus processing time](/en/ch12#id322), [Microbatching and checkpointing](/en/ch12#id329), [Unifying batch and stream processing](/en/ch13#id338)
- knowing when window is ready, [Handling straggler events](/en/ch12#id323)
- types of windows, [Types of windows](/en/ch12#id324)
- relation to databases (see streams)
- relation to services, [Stream processors and services](/en/ch13#id345)
- relationship to batch processing, [Batch Processing](/en/ch11#ch_batch)
- search on streams, [Search on streams](/en/ch12#id320)
- single-threaded execution, [Logs compared to traditional messaging](/en/ch12#sec_stream_logs_vs_messaging), [Concurrency control](/en/ch12#sec_stream_concurrency)
- stream analytics, [Stream analytics](/en/ch12#id318)
- stream joins, [Stream Joins](/en/ch12#sec_stream_joins)-[Time-dependence of joins](/en/ch12#sec_stream_join_time)
- stream-stream join, [Stream-stream join (window join)](/en/ch12#id440)
- stream-table join, [Stream-table join (stream enrichment)](/en/ch12#sec_stream_table_joins)
- table-table join, [Table-table join (materialized view maintenance)](/en/ch12#id326)
- time-dependence of, [Time-dependence of joins](/en/ch12#sec_stream_join_time)
- streams, [Stream Processing](/en/ch12#ch_stream)-[Replaying old messages](/en/ch12#sec_stream_replay)
- end-to-end, pushing events to clients, [End-to-end event streams](/en/ch13#id349)
- messaging systems (see messaging systems)
- processing (see stream processing)
- relation to databases, [Databases and Streams](/en/ch12#sec_stream_databases)-[Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- (see also changelogs)
- API support for change streams, [API support for change streams](/en/ch12#sec_stream_change_api)
- change data capture, [Change Data Capture](/en/ch12#sec_stream_cdc)-[API support for change streams](/en/ch12#sec_stream_change_api)
- derivative of state by time, [State, Streams, and Immutability](/en/ch12#sec_stream_immutability)
- event sourcing, [Change data capture versus event sourcing](/en/ch12#sec_stream_event_sourcing)
- keeping systems in sync, [Keeping Systems in Sync](/en/ch12#sec_stream_sync)
- philosophy of immutable events, [State, Streams, and Immutability](/en/ch12#sec_stream_immutability)-[Limitations of immutability](/en/ch12#sec_stream_immutability_limitations)
- topics, [Transmitting Event Streams](/en/ch12#sec_stream_transmit)
- strict serializability, [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- timeliness vs. integrity, [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- striping (in columnar encoding), [Column-Oriented Storage](/en/ch4#sec_storage_column)
- strong consistency (see linearizability)
- strong eventual consistency, [Automatic conflict resolution](/en/ch6#automatic-conflict-resolution)
- strong one-copy serializability, [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- subjects, predicates, and objects (in triple-stores), [Triple-Stores and SPARQL](/en/ch3#id59)
- subscribers (message streams), [Transmitting Event Streams](/en/ch12#sec_stream_transmit)
- (see also consumers)
- supercomputers, [Cloud Computing Versus Supercomputing](/en/ch1#id17)
- Superset (data visualization software), [Analytics](/en/ch11#sec_batch_olap)
- surveillance, [Surveillance](/en/ch14#id374)
- (see also privacy)
- sushi principle, [From data warehouse to data lake](/en/ch1#from-data-warehouse-to-data-lake)
- sustainability, [Distributed Versus Single-Node Systems](/en/ch1#sec_introduction_distributed)
- Swagger (service definition format), [Web services](/en/ch5#sec_web_services)
- swapping to disk (see virtual memory)
- Swift (programming language)
- memory management, [Limiting the impact of garbage collection](/en/ch9#sec_distributed_gc_impact)
- sync engines, [Sync Engines and Local-First Software](/en/ch6#sec_replication_offline_clients)-[Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- examples of, [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- for local-first software, [Real-time collaboration, offline-first, and local-first apps](/en/ch6#real-time-collaboration-offline-first-and-local-first-apps)
- synchronous networks, [Synchronous Versus Asynchronous Networks](/en/ch9#sec_distributed_sync_networks), [Glossary](/en/glossary)
- comparison to asynchronous networks, [Synchronous Versus Asynchronous Networks](/en/ch9#sec_distributed_sync_networks)
- system model, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- synchronous replication, [Synchronous Versus Asynchronous Replication](/en/ch6#sec_replication_sync_async), [Glossary](/en/glossary)
- with multiple leaders, [Multi-Leader Replication](/en/ch6#sec_replication_multi_leader)
- system administrator, [Operations in the Cloud Era](/en/ch1#sec_introduction_operations)
- system models, [Knowledge, Truth, and Lies](/en/ch9#sec_distributed_truth), [System Model and Reality](/en/ch9#sec_distributed_system_model)-[Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- assumptions in, [Trust, but Verify](/en/ch13#sec_future_verification)
- correctness of algorithms, [Defining the correctness of an algorithm](/en/ch9#defining-the-correctness-of-an-algorithm)
- mapping to the real world, [Mapping system models to the real world](/en/ch9#mapping-system-models-to-the-real-world)
- safety and liveness, [Safety and liveness](/en/ch9#sec_distributed_safety_liveness)
- systems of record, [Systems of Record and Derived Data](/en/ch1#sec_introduction_derived), [Glossary](/en/glossary)
- change data capture, [Implementing change data capture](/en/ch12#id307), [Reasoning about dataflows](/en/ch13#id443)
- event logs, [Event Sourcing and CQRS](/en/ch3#sec_datamodels_events)
- treating event log as, [State, Streams, and Immutability](/en/ch12#sec_stream_immutability)
- systems thinking, [Feedback Loops](/en/ch14#id372)
### T
- t-digest (algorithm), [Use of Response Time Metrics](/en/ch2#sec_introduction_slo_sla)
- table-table joins, [Table-table join (materialized view maintenance)](/en/ch12#id326)
- Tableau (data visualization software), [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp), [Analytics](/en/ch11#sec_batch_olap)
- tail (Unix tool), [Using logs for message storage](/en/ch12#id300)
- tail latency (see latency)
- tail vertex (property graphs), [Property Graphs](/en/ch3#id56)
- task (workflows) (see workflow engines)
- TCP (Transmission Control Protocol), [The Limitations of TCP](/en/ch9#sec_distributed_tcp)
- comparison to circuit switching, [Can we not simply make network delays predictable?](/en/ch9#can-we-not-simply-make-network-delays-predictable)
- comparison to UDP, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- connection failures, [Detecting Faults](/en/ch9#id307)
- flow control, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing), [Messaging Systems](/en/ch12#sec_stream_messaging)
- packet checksums, [Weak forms of lying](/en/ch9#weak-forms-of-lying), [The end-to-end argument](/en/ch13#sec_future_e2e_argument), [Trust, but Verify](/en/ch13#sec_future_verification)
- reliability and duplicate suppression, [Duplicate suppression](/en/ch13#id354)
- retransmission timeouts, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- use for transaction sessions, [Single-Object and Multi-Object Operations](/en/ch8#sec_transactions_multi_object)
- Temporal (workflow engine), [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows)
- TensorFlow (machine learning library), [Machine Learning](/en/ch11#id290)
- Teradata (database), [Cloud-Native System Architecture](/en/ch1#sec_introduction_cloud_native), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- term-partitioned indexes (see global secondary indexes)
- termination (consensus), [Single-value consensus](/en/ch10#single-value-consensus), [Atomic commitment as consensus](/en/ch10#atomic-commitment-as-consensus)
- testing, [Humans and Reliability](/en/ch2#id31)
- thrashing (out of memory), [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- threads (concurrency)
- actor model, [Distributed actor frameworks](/en/ch5#distributed-actor-frameworks), [Event-Driven Architectures and RPC](/en/ch12#sec_stream_actors_drpc)
- (see also event-driven architecture)
- atomic operations, [Atomicity](/en/ch8#sec_transactions_acid_atomicity)
- background threads, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables)
- execution pauses, [Can we not simply make network delays predictable?](/en/ch9#can-we-not-simply-make-network-delays-predictable), [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- memory barriers, [Linearizability and network delays](/en/ch10#linearizability-and-network-delays)
- preemption, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- single (see single-threaded execution)
- three-phase commit, [Three-phase commit](/en/ch8#three-phase-commit)
- three-way relationships, [Property Graphs](/en/ch3#id56)
- Thrift (data format), [Protocol Buffers](/en/ch5#sec_encoding_protobuf)
- throughput, [Describing Performance](/en/ch2#sec_introduction_percentiles), [Describing Load](/en/ch2#id33), [Batch Processing](/en/ch11#ch_batch)
- TIBCO, [Message brokers](/en/ch5#message-brokers)
- Enterprise Message Service, [Message brokers compared to databases](/en/ch12#id297)
- StreamBase (stream analytics), [Complex event processing](/en/ch12#id317)
- TiDB (database)
- consensus-based replication, [Single-Leader Replication](/en/ch6#sec_replication_leader)
- regions (sharding), [Sharding](/en/ch7#ch_sharding)
- request routing, [Request Routing](/en/ch7#sec_sharding_routing)
- serving derived data, [Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- sharded secondary indexes, [Global Secondary Indexes](/en/ch7#id167)
- snapshot isolation support, [Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)
- timestamp oracle, [Implementing a linearizable ID generator](/en/ch10#implementing-a-linearizable-id-generator)
- transactions, [What Exactly Is a Transaction?](/en/ch8#sec_transactions_overview), [Database-internal Distributed Transactions](/en/ch8#sec_transactions_internal)
- use of model-checking, [Model checking and specification languages](/en/ch9#model-checking-and-specification-languages)
- tiered storage, [Setting Up New Followers](/en/ch6#sec_replication_new_replica), [Disk space usage](/en/ch12#sec_stream_disk_usage)
- TigerBeetle (database), [Summary](/en/ch3#summary)
- deterministic simulation testing, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- TigerGraph (database)
- GSQL language, [Graph Queries in SQL](/en/ch3#id58)
- Tigris (object storage), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- TileDB (database), [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- time
- concurrency and, [The "happens-before" relation and concurrency](/en/ch6#sec_replication_happens_before)
- cross-channel timing dependencies, [Cross-channel timing dependencies](/en/ch10#cross-channel-timing-dependencies)
- in distributed systems, [Unreliable Clocks](/en/ch9#sec_distributed_clocks)-[Limiting the impact of garbage collection](/en/ch9#sec_distributed_gc_impact)
- (see also clocks)
- clock synchronization and accuracy, [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy)
- relying on synchronized clocks, [Relying on Synchronized Clocks](/en/ch9#sec_distributed_clocks_relying)-[Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- process pauses, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)-[Limiting the impact of garbage collection](/en/ch9#sec_distributed_gc_impact)
- reasoning about, in stream processors, [Reasoning About Time](/en/ch12#sec_stream_time)-[Types of windows](/en/ch12#id324)
- event time versus processing time, [Event time versus processing time](/en/ch12#id322), [Microbatching and checkpointing](/en/ch12#id329), [Unifying batch and stream processing](/en/ch13#id338)
- knowing when window is ready, [Handling straggler events](/en/ch12#id323)
- timestamp of events, [Whose clock are you using, anyway?](/en/ch12#id438)
- types of windows, [Types of windows](/en/ch12#id324)
- system models for distributed systems, [System Model and Reality](/en/ch9#sec_distributed_system_model)
- time-dependence in stream joins, [Time-dependence of joins](/en/ch12#sec_stream_join_time)
- time series data
- as DataFrames, [DataFrames, Matrices, and Arrays](/en/ch3#sec_datamodels_dataframes)
- column-oriented storage, [Column-Oriented Storage](/en/ch4#sec_storage_column)
- time-of-day clocks, [Time-of-day clocks](/en/ch9#time-of-day-clocks)
- hybrid logical clocks, [Hybrid logical clocks](/en/ch10#hybrid-logical-clocks)
- timeliness, [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- coordination-avoiding data systems, [Coordination-avoiding data systems](/en/ch13#id454)
- correctness of dataflow systems, [Correctness of dataflow systems](/en/ch13#id453)
- timeouts, [Unreliable Networks](/en/ch9#sec_distributed_networks), [Glossary](/en/glossary)
- dynamic configuration of, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- for failover, [Leader failure: Failover](/en/ch6#leader-failure-failover)
- length of, [Timeouts and Unbounded Delays](/en/ch9#sec_distributed_queueing)
- TimescaleDB (database), [Column-Oriented Storage](/en/ch4#sec_storage_column)
- timestamps, [Logical Clocks](/en/ch10#sec_consistency_timestamps)
- assigning to events in stream processing, [Whose clock are you using, anyway?](/en/ch12#id438)
- for read-after-write consistency, [Reading Your Own Writes](/en/ch6#sec_replication_ryw)
- for transaction ordering, [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
- insufficiency for enforcing constraints, [Enforcing constraints using logical clocks](/en/ch10#enforcing-constraints-using-logical-clocks)
- key range sharding by, [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- Lamport, [Lamport timestamps](/en/ch10#lamport-timestamps)
- logical, [Ordering events to capture causality](/en/ch13#sec_future_capture_causality)
- ordering events, [Timestamps for ordering events](/en/ch9#sec_distributed_lww)
- timestamp oracle, [Implementing a linearizable ID generator](/en/ch10#implementing-a-linearizable-id-generator)
- TLA+ (specification language), [Model checking and specification languages](/en/ch9#model-checking-and-specification-languages)
- token bucket (limiting retries), [Describing Performance](/en/ch2#sec_introduction_percentiles)
- tombstones, [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables), [Disk space usage](/en/ch4#disk-space-usage), [Log compaction](/en/ch12#sec_stream_log_compaction)
- topics (messaging), [Message brokers](/en/ch5#message-brokers), [Transmitting Event Streams](/en/ch12#sec_stream_transmit)
- torn pages (B-trees), [Making B-trees reliable](/en/ch4#sec_storage_btree_wal)
- total order, [Glossary](/en/glossary)
- broadcast (see shared logs)
- limits of, [The limits of total ordering](/en/ch13#id335)
- on logical timestamps, [Logical Clocks](/en/ch10#sec_consistency_timestamps)
- tracing, [Problems with Distributed Systems](/en/ch1#sec_introduction_dist_sys_problems)
- tracking behavioral data, [Privacy and Tracking](/en/ch14#id373)
- (see also privacy)
- trade-offs, [Trade-offs in Data Systems Architecture](/en/ch1#ch_tradeoffs)-[Data Systems, Law, and Society](/en/ch1#sec_introduction_compliance)
- transaction coordinator (see coordinator)
- transaction manager (see coordinator)
- transaction processing, [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp)
- comparison to analytics, [Characterizing Transaction Processing and Analytics](/en/ch1#sec_introduction_oltp)
- comparison to data warehousing, [Data Storage for Analytics](/en/ch4#sec_storage_analytics)
- transactions, [Transactions](/en/ch8#ch_transactions)-[Summary](/en/ch8#summary), [Glossary](/en/glossary)
- ACID properties of, [The Meaning of ACID](/en/ch8#sec_transactions_acid)
- atomicity, [Atomicity](/en/ch8#sec_transactions_acid_atomicity)
- consistency, [Consistency](/en/ch8#sec_transactions_acid_consistency)
- durability, [Making B-trees reliable](/en/ch4#sec_storage_btree_wal), [Durability](/en/ch8#durability)
- isolation, [Isolation](/en/ch8#sec_transactions_acid_isolation)
- and derived data integrity, [Timeliness and Integrity](/en/ch13#sec_future_integrity)
- and replication, [Solutions for Replication Lag](/en/ch6#id131)
- compensating (see compensating transactions)
- concept of, [What Exactly Is a Transaction?](/en/ch8#sec_transactions_overview)
- distributed transactions, [Distributed Transactions](/en/ch8#sec_transactions_distributed)-[Exactly-once message processing revisited](/en/ch8#exactly-once-message-processing-revisited)
- avoiding, [Derived data versus distributed transactions](/en/ch13#sec_future_derived_vs_transactions), [Making unbundling work](/en/ch13#sec_future_unbundling_favor), [Enforcing Constraints](/en/ch13#sec_future_constraints)-[Coordination-avoiding data systems](/en/ch13#id454)
- failure amplification, [Maintaining derived state](/en/ch13#id446)
- for sharded systems, [Pros and Cons of Sharding](/en/ch7#sec_sharding_reasons)
- in doubt/uncertain status, [Coordinator failure](/en/ch8#coordinator-failure), [Holding locks while in doubt](/en/ch8#holding-locks-while-in-doubt)
- two-phase commit, [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc)-[Three-phase commit](/en/ch8#three-phase-commit)
- use of, [Distributed Transactions Across Different Systems](/en/ch8#sec_transactions_xa)-[Exactly-once message processing](/en/ch8#sec_transactions_exactly_once)
- XA transactions, [XA transactions](/en/ch8#xa-transactions)-[Problems with XA transactions](/en/ch8#problems-with-xa-transactions)
- OLTP versus analytics queries, [Analytics](/en/ch11#sec_batch_olap)
- purpose of, [Transactions](/en/ch8#ch_transactions)
- serializability, [Serializability](/en/ch8#sec_transactions_serializability)-[Performance of serializable snapshot isolation](/en/ch8#performance-of-serializable-snapshot-isolation)
- actual serial execution, [Actual Serial Execution](/en/ch8#sec_transactions_serial)-[Summary of serial execution](/en/ch8#summary-of-serial-execution)
- pessimistic versus optimistic concurrency control, [Pessimistic versus optimistic concurrency control](/en/ch8#pessimistic-versus-optimistic-concurrency-control)
- serializable snapshot isolation (SSI), [Serializable Snapshot Isolation (SSI)](/en/ch8#sec_transactions_ssi)-[Performance of serializable snapshot isolation](/en/ch8#performance-of-serializable-snapshot-isolation)
- two-phase locking (2PL), [Two-Phase Locking (2PL)](/en/ch8#sec_transactions_2pl)-[Index-range locks](/en/ch8#sec_transactions_2pl_range)
- single-object and multi-object, [Single-Object and Multi-Object Operations](/en/ch8#sec_transactions_multi_object)-[Handling errors and aborts](/en/ch8#handling-errors-and-aborts)
- handling errors and aborts, [Handling errors and aborts](/en/ch8#handling-errors-and-aborts)
- need for multi-object transactions, [The need for multi-object transactions](/en/ch8#sec_transactions_need)
- single-object writes, [Single-object writes](/en/ch8#sec_transactions_single_object)
- snapshot isolation (see snapshots)
- strict serializability, [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition)
- weak isolation levels, [Weak Isolation Levels](/en/ch8#sec_transactions_isolation_levels)-[Materializing conflicts](/en/ch8#materializing-conflicts)
- preventing lost updates, [Preventing Lost Updates](/en/ch8#sec_transactions_lost_update)-[Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- read committed, [Read Committed](/en/ch8#sec_transactions_read_committed)-[Snapshot Isolation and Repeatable Read](/en/ch8#sec_transactions_snapshot_isolation)
- traversal (graphs), [Property Graphs](/en/ch3#id56)
- trie (data structure), [Constructing and merging SSTables](/en/ch4#constructing-and-merging-sstables), [Full-Text Search](/en/ch4#sec_storage_full_text)
- as SSTable index, [The SSTable file format](/en/ch4#the-sstable-file-format)
- triggers (databases), [Transmitting Event Streams](/en/ch12#sec_stream_transmit)
- Trino (data warehouse), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- federated databases, [The meta-database of everything](/en/ch13#id341)
- query optimizer, [Query languages](/en/ch11#sec_batch_query_lanauges)
- use for ETL, [Extract--Transform--Load (ETL)](/en/ch11#sec_batch_etl_usage)
- workflow example, [Scheduling Workflows](/en/ch11#sec_batch_workflows)
- triple-stores, [Triple-Stores and SPARQL](/en/ch3#id59)-[The SPARQL query language](/en/ch3#the-sparql-query-language)
- SPARQL query language, [The SPARQL query language](/en/ch3#the-sparql-query-language)
- tumbling windows (stream processing), [Types of windows](/en/ch12#id324)
- (see also windows)
- in microbatching, [Microbatching and checkpointing](/en/ch12#id329)
- Turbopuffer (vector search), [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- Turtle (RDF data format), [Triple-Stores and SPARQL](/en/ch3#id59)
- Twitter (see X (social network))
- two-phase commit (2PC), [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc)-[Coordinator failure](/en/ch8#coordinator-failure), [Glossary](/en/glossary)
- confusion with two-phase locking, [Two-Phase Locking (2PL)](/en/ch8#sec_transactions_2pl)
- coordinator failure, [Coordinator failure](/en/ch8#coordinator-failure)
- coordinator recovery, [Recovering from coordinator failure](/en/ch8#recovering-from-coordinator-failure)
- how it works, [A system of promises](/en/ch8#a-system-of-promises)
- performance cost, [Distributed Transactions Across Different Systems](/en/ch8#sec_transactions_xa)
- problems with XA transactions, [Problems with XA transactions](/en/ch8#problems-with-xa-transactions)
- transactions holding locks, [Holding locks while in doubt](/en/ch8#holding-locks-while-in-doubt)
- two-phase locking (2PL), [Two-Phase Locking (2PL)](/en/ch8#sec_transactions_2pl)-[Index-range locks](/en/ch8#sec_transactions_2pl_range), [What Makes a System Linearizable?](/en/ch10#sec_consistency_lin_definition), [Glossary](/en/glossary)
- confusion with two-phase commit, [Two-Phase Locking (2PL)](/en/ch8#sec_transactions_2pl)
- growing and shrinking phases, [Implementation of two-phase locking](/en/ch8#implementation-of-two-phase-locking)
- index-range locks, [Index-range locks](/en/ch8#sec_transactions_2pl_range)
- performance of, [Performance of two-phase locking](/en/ch8#performance-of-two-phase-locking)
- type checking, dynamic versus static, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility)
### U
- UDP (User Datagram Protocol)
- comparison to TCP, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- multicast, [Direct messaging from producers to consumers](/en/ch12#id296)
- Ultima Online (game), [Sharding](/en/ch7#ch_sharding)
- unbounded datasets, [Stream Processing](/en/ch12#ch_stream), [Glossary](/en/glossary)
- (see also streams)
- unbounded delays, [Glossary](/en/glossary)
- in networks, [Timeouts and Unbounded Delays](/en/ch9#sec_distributed_queueing)
- process pauses, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- unbundling databases, [Unbundling Databases](/en/ch13#sec_future_unbundling)-[Multi-shard data processing](/en/ch13#sec_future_unbundled_multi_shard)
- composing data storage technologies, [Composing Data Storage Technologies](/en/ch13#id447)-[Unbundled versus integrated systems](/en/ch13#id448)
- federation versus unbundling, [The meta-database of everything](/en/ch13#id341)
- designing applications around dataflow, [Designing Applications Around Dataflow](/en/ch13#sec_future_dataflow)-[Stream processors and services](/en/ch13#id345)
- observing derived state, [Observing Derived State](/en/ch13#sec_future_observing)-[Multi-shard data processing](/en/ch13#sec_future_unbundled_multi_shard)
- materialized views and caching, [Materialized views and caching](/en/ch13#id451)
- multi-shard data processing, [Multi-shard data processing](/en/ch13#sec_future_unbundled_multi_shard)
- pushing state changes to clients, [Pushing state changes to clients](/en/ch13#id348)
- uncertain (transaction status) (see in doubt)
- union type (in Avro), [Schema evolution rules](/en/ch5#schema-evolution-rules)
- uniq (Unix tool), [Simple Log Analysis](/en/ch11#sec_batch_log_analysis), [Distributed Job Orchestration](/en/ch11#id278)
- uniqueness constraints
- asynchronously checked, [Loosely interpreted constraints](/en/ch13#id362)
- requiring consensus, [Uniqueness constraints require consensus](/en/ch13#id452)
- requiring linearizability, [Constraints and uniqueness guarantees](/en/ch10#sec_consistency_uniqueness)
- uniqueness in log-based messaging, [Uniqueness in log-based messaging](/en/ch13#sec_future_uniqueness_log)
- Unity (data catalog), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- universally unique identifiers (see UUIDs)
- Unix philosophy
- comparison to relational databases, [Unbundling Databases](/en/ch13#sec_future_unbundling), [The meta-database of everything](/en/ch13#id341)
- comparison to stream processing, [Processing Streams](/en/ch12#sec_stream_processing)
- Unix pipes, [Simple Log Analysis](/en/ch11#sec_batch_log_analysis)
- compared to distributed batch processing, [Scheduling Workflows](/en/ch11#sec_batch_workflows)
- UPDATE statement (SQL), [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility)
- updates
- preventing lost updates, [Preventing Lost Updates](/en/ch8#sec_transactions_lost_update)-[Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- atomic write operations, [Atomic write operations](/en/ch8#atomic-write-operations)
- automatically detecting lost updates, [Automatically detecting lost updates](/en/ch8#automatically-detecting-lost-updates)
- compare-and-set (CAS), [Conditional writes (compare-and-set)](/en/ch8#sec_transactions_compare_and_set)
- conflict resolution and replication, [Conflict resolution and replication](/en/ch8#conflict-resolution-and-replication)
- using explicit locking, [Explicit locking](/en/ch8#explicit-locking)
- preventing write skew, [Write Skew and Phantoms](/en/ch8#sec_transactions_write_skew)-[Materializing conflicts](/en/ch8#materializing-conflicts)
- utilization
- batch process scheduling, [Resource Allocation](/en/ch11#id279)
- increasing through preemption, [Handling Faults](/en/ch11#id281)
- trade-off with latency, [Can we not simply make network delays predictable?](/en/ch9#can-we-not-simply-make-network-delays-predictable)
- uTP protocol (BitTorrent), [The Limitations of TCP](/en/ch9#sec_distributed_tcp)
- UUIDs, [ID Generators and Logical Clocks](/en/ch10#sec_consistency_logical)
### V
- validity (consensus), [Single-value consensus](/en/ch10#single-value-consensus), [Atomic commitment as consensus](/en/ch10#atomic-commitment-as-consensus)
- vBuckets (sharding), [Sharding](/en/ch7#ch_sharding)
- vector clocks, [Version vectors](/en/ch6#version-vectors)
- (see also version vectors)
- and Lamport/hybrid logical clocks, [Lamport/hybrid logical clocks versus vector clocks](/en/ch10#lamporthybrid-logical-clocks-vs-vector-clocks)
- and version vectors, [Version vectors](/en/ch6#version-vectors)
- vector embedding, [Vector Embeddings](/en/ch4#id92)
- vectorized processing, [Query Execution: Compilation and Vectorization](/en/ch4#sec_storage_vectorized)
- vendor lock-in, [Pros and Cons of Cloud Services](/en/ch1#sec_introduction_cloud_tradeoffs)
- Venice (database), [Serving Derived Data](/en/ch11#sec_batch_serving_derived)
- verification, [Trust, but Verify](/en/ch13#sec_future_verification)-[Tools for auditable data systems](/en/ch13#id366)
- avoiding blind trust, [Don't just blindly trust what they promise](/en/ch13#id364)
- designing for auditability, [Designing for auditability](/en/ch13#id365)
- end-to-end integrity checks, [The end-to-end argument again](/en/ch13#id456)
- tools for auditable data systems, [Tools for auditable data systems](/en/ch13#id366)
- version control systems
- merge conflicts, [Manual conflict resolution](/en/ch6#manual-conflict-resolution)
- reliance on immutable data, [Concurrency control](/en/ch12#sec_stream_concurrency)
- version vectors, [Problems with different topologies](/en/ch6#problems-with-different-topologies), [Version vectors](/en/ch6#version-vectors)
- dotted, [Version vectors](/en/ch6#version-vectors)
- versus vector clocks, [Version vectors](/en/ch6#version-vectors)
- Vertica (database), [Cloud Data Warehouses](/en/ch4#sec_cloud_data_warehouses)
- handling writes, [Writing to Column-Oriented Storage](/en/ch4#writing-to-column-oriented-storage)
- vertical scaling (see scaling up)
- vertices (in graphs), [Graph-Like Data Models](/en/ch3#sec_datamodels_graph)
- property graph model, [Property Graphs](/en/ch3#id56)
- video games, [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- video transcoding (example), [Cross-channel timing dependencies](/en/ch10#cross-channel-timing-dependencies)
- views (SQL queries), [Datalog: Recursive Relational Queries](/en/ch3#id62)
- materialized views (see materialization)
- Viewstamped Replication (consensus algorithm), [Consensus](/en/ch10#sec_consistency_consensus), [Consensus in Practice](/en/ch10#sec_consistency_total_order)
- use of model-checking, [Model checking and specification languages](/en/ch9#model-checking-and-specification-languages)
- view number, [From single-leader replication to consensus](/en/ch10#from-single-leader-replication-to-consensus)
- virtual block device, [Separation of storage and compute](/en/ch1#sec_introduction_storage_compute)
- virtual file system, [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- comparison to distributed filesystems, [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- virtual machines, [Layering of cloud services](/en/ch1#layering-of-cloud-services)
- context switches, [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- network performance, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- noisy neighbors, [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- virtualized clocks in, [Clock Synchronization and Accuracy](/en/ch9#sec_distributed_clock_accuracy)
- virtual memory
- process pauses due to page faults, [Latency and Response Time](/en/ch2#id23), [Process Pauses](/en/ch9#sec_distributed_clocks_pauses)
- Virtuoso (database), [The SPARQL query language](/en/ch3#the-sparql-query-language)
- VisiCalc (spreadsheets), [Designing Applications Around Dataflow](/en/ch13#sec_future_dataflow)
- Vitess (database)
- key-range sharding, [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- vnodes (sharding), [Sharding](/en/ch7#ch_sharding)
- vocabularies, [Triple-Stores and SPARQL](/en/ch3#id59)
- Voice over IP (VoIP), [Network congestion and queueing](/en/ch9#network-congestion-and-queueing)
- VoltDB (database)
- cross-shard serializability, [Sharding](/en/ch8#sharding)
- deterministic stored procedures, [Pros and cons of stored procedures](/en/ch8#sec_transactions_stored_proc_tradeoffs)
- in-memory storage, [Keeping everything in memory](/en/ch4#sec_storage_inmemory)
- process-per-core model, [Pros and Cons of Sharding](/en/ch7#sec_sharding_reasons)
- secondary indexes, [Local Secondary Indexes](/en/ch7#id166)
- serial execution of transactions, [Actual Serial Execution](/en/ch8#sec_transactions_serial)
- statement-based replication, [Statement-based replication](/en/ch6#statement-based-replication), [Rebuilding state after a failure](/en/ch12#sec_stream_state_fault_tolerance)
- transactions in stream processing, [Atomic commit revisited](/en/ch12#sec_stream_atomic_commit)
### W
- WAL (write-ahead log), [Making B-trees reliable](/en/ch4#sec_storage_btree_wal)
- WAL-G (backup tool), [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- WarpStream (messaging), [Disk space usage](/en/ch12#sec_stream_disk_usage)
- web services (see services)
- webhooks, [Direct messaging from producers to consumers](/en/ch12#id296)
- webMethods (messaging), [Message brokers](/en/ch5#message-brokers)
- WebSocket (protocol), [Pushing state changes to clients](/en/ch13#id348)
- wide-column data model, [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality)
- versus column-oriented storage, [Column Compression](/en/ch4#sec_storage_column_compression)
- windows (stream processing), [Stream analytics](/en/ch12#id318), [Reasoning About Time](/en/ch12#sec_stream_time)-[Types of windows](/en/ch12#id324)
- infinite windows for changelogs, [Maintaining materialized views](/en/ch12#sec_stream_mat_view), [Stream-table join (stream enrichment)](/en/ch12#sec_stream_table_joins)
- knowing when all events have arrived, [Handling straggler events](/en/ch12#id323)
- stream joins within a window, [Stream-stream join (window join)](/en/ch12#id440)
- types of windows, [Types of windows](/en/ch12#id324)
- WITH RECURSIVE syntax (SQL), [Graph Queries in SQL](/en/ch3#id58)
- Word2Vec (language model), [Vector Embeddings](/en/ch4#id92)
- workflow engines, [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows)
- Airflow (see Airflow (workflow scheduler))
- batch processing, [Scheduling Workflows](/en/ch11#sec_batch_workflows)
- Camunda (see Camunda (workflow engine))
- Dagster (see Dagster (workflow scheduler))
- durable execution, [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows)
- ETL (see ETL (extract-transform-load))
- executor, [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows)
- orchestrators, [Durable Execution and Workflows](/en/ch5#sec_encoding_dataflow_workflows), [Batch Processing](/en/ch11#ch_batch)
- Orkes (see Orkes (workflow engine))
- Prefect (see Prefect (workflow scheduler))
- reliance on determinism, [Deterministic simulation testing](/en/ch9#deterministic-simulation-testing)
- Restate (see Restate (workflow engine))
- Temporal (see Temporal (workflow engine))
- working set, [Sorting Versus In-memory Aggregation](/en/ch11#id275)
- write amplification, [Write amplification](/en/ch4#write-amplification)
- write path (derived data), [Observing Derived State](/en/ch13#sec_future_observing)
- write skew (transaction isolation), [Write Skew and Phantoms](/en/ch8#sec_transactions_write_skew)-[Materializing conflicts](/en/ch8#materializing-conflicts)
- characterizing, [Write Skew and Phantoms](/en/ch8#sec_transactions_write_skew)-[Phantoms causing write skew](/en/ch8#sec_transactions_phantom), [Decisions based on an outdated premise](/en/ch8#decisions-based-on-an-outdated-premise)
- examples of, [Write Skew and Phantoms](/en/ch8#sec_transactions_write_skew), [More examples of write skew](/en/ch8#more-examples-of-write-skew)
- materializing conflicts, [Materializing conflicts](/en/ch8#materializing-conflicts)
- occurrence in practice, [Maintaining integrity in the face of software bugs](/en/ch13#id455)
- phantoms, [Phantoms causing write skew](/en/ch8#sec_transactions_phantom)
- preventing
- in snapshot isolation, [Decisions based on an outdated premise](/en/ch8#decisions-based-on-an-outdated-premise)-[Detecting writes that affect prior reads](/en/ch8#sec_detecting_writes_affect_reads)
- in two-phase locking, [Predicate locks](/en/ch8#predicate-locks)-[Index-range locks](/en/ch8#sec_transactions_2pl_range)
- options for, [Characterizing write skew](/en/ch8#characterizing-write-skew)
- write-ahead log (WAL), [Making B-trees reliable](/en/ch4#sec_storage_btree_wal), [Write-ahead log (WAL) shipping](/en/ch6#write-ahead-log-wal-shipping)
- in durable execution, [Durable execution](/en/ch5#durable-execution)
- writes (database)
- atomic write operations, [Atomic write operations](/en/ch8#atomic-write-operations)
- detecting writes affecting prior reads, [Detecting writes that affect prior reads](/en/ch8#sec_detecting_writes_affect_reads)
- preventing dirty writes with read committed, [No dirty writes](/en/ch8#sec_transactions_dirty_write)
- WS-\* framework, [The problems with remote procedure calls (RPCs)](/en/ch5#sec_problems_with_rpc)
- WS-AtomicTransaction (2PC), [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc)
### X
- X (social network)
- constructing home timelines (example), [Case Study: Social Network Home Timelines](/en/ch2#sec_introduction_twitter), [Deriving several views from the same event log](/en/ch12#sec_stream_deriving_views), [Table-table join (materialized view maintenance)](/en/ch12#id326), [Materialized views and caching](/en/ch13#id451)
- cost of joins, [Denormalization in the social networking case study](/en/ch3#denormalization-in-the-social-networking-case-study)
- describing load, [Describing Load](/en/ch2#id33)
- fault tolerance, [Fault Tolerance](/en/ch2#id27)
- performance metrics, [Describing Performance](/en/ch2#sec_introduction_percentiles)
- DistributedLog (event log), [Using logs for message storage](/en/ch12#id300)
- Snowflake (ID generator), [ID Generators and Logical Clocks](/en/ch10#sec_consistency_logical)
- XA transactions, [Two-Phase Commit (2PC)](/en/ch8#sec_transactions_2pc), [XA transactions](/en/ch8#xa-transactions)-[Problems with XA transactions](/en/ch8#problems-with-xa-transactions)
- heuristic decisions, [Recovering from coordinator failure](/en/ch8#recovering-from-coordinator-failure)
- problems with, [Problems with XA transactions](/en/ch8#problems-with-xa-transactions)
- xargs (Unix tool), [Simple Log Analysis](/en/ch11#sec_batch_log_analysis)
- XFS (filesystem), [Distributed Filesystems](/en/ch11#sec_batch_dfs)
- XGBoost (machine learning library), [Machine Learning](/en/ch11#id290)
- XML
- binary variants, [Binary encoding](/en/ch5#binary-encoding)
- data locality, [Data locality for reads and writes](/en/ch3#sec_datamodels_document_locality)
- encoding RDF data, [The RDF data model](/en/ch3#the-rdf-data-model)
- for application data, issues with, [JSON, XML, and Binary Variants](/en/ch5#sec_encoding_json)
- in relational databases, [Schema flexibility in the document model](/en/ch3#sec_datamodels_schema_flexibility)
- XML databases, [Relational Model versus Document Model](/en/ch3#sec_datamodels_history), [Query languages for documents](/en/ch3#query-languages-for-documents)
- Xorq (query engine), [The meta-database of everything](/en/ch13#id341)
- XPath, [Query languages for documents](/en/ch3#query-languages-for-documents)
- XQuery, [Query languages for documents](/en/ch3#query-languages-for-documents)
### Y
- Yahoo
- response time study, [Average, Median, and Percentiles](/en/ch2#id24)
- YARN (job scheduler), [Distributed Job Orchestration](/en/ch11#id278), [Separation of application code and state](/en/ch13#id344)
- ApplicationMaster, [Distributed Job Orchestration](/en/ch11#id278)
- Yjs (CRDT library), [Pros and cons of sync engines](/en/ch6#pros-and-cons-of-sync-engines)
- YugabyteDB (database)
- hash-range sharding, [Sharding by hash range](/en/ch7#sharding-by-hash-range)
- key-range sharding, [Sharding by Key Range](/en/ch7#sec_sharding_key_range)
- multi-leader replication, [Geographically Distributed Operation](/en/ch6#sec_replication_multi_dc)
- request routing, [Request Routing](/en/ch7#sec_sharding_routing)
- sharded secondary indexes, [Global Secondary Indexes](/en/ch7#id167)
- tablets (sharding), [Sharding](/en/ch7#ch_sharding)
- transactions, [What Exactly Is a Transaction?](/en/ch8#sec_transactions_overview), [Database-internal Distributed Transactions](/en/ch8#sec_transactions_internal)
- use of clock synchronization, [Synchronized clocks for global snapshots](/en/ch9#sec_distributed_spanner)
### Z
- Zab (consensus algorithm), [Consensus](/en/ch10#sec_consistency_consensus), [Consensus in Practice](/en/ch10#sec_consistency_total_order)
- use in ZooKeeper, [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)
- zero-copy, [Formats for Encoding Data](/en/ch5#sec_encoding_formats)
- zero-disk architecture (ZDA), [Setting Up New Followers](/en/ch6#sec_replication_new_replica)
- ZeroMQ (messaging library), [Direct messaging from producers to consumers](/en/ch12#id296)
- zombies (split brain), [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens)
- zones (cloud computing) (see availability zones)
- ZooKeeper (coordination service), [Coordination Services](/en/ch10#sec_consistency_coordination)-[Service discovery](/en/ch10#service-discovery)
- generating fencing tokens, [Fencing off zombies and delayed requests](/en/ch9#sec_distributed_fencing_tokens), [Using shared logs](/en/ch10#sec_consistency_smr), [Coordination Services](/en/ch10#sec_consistency_coordination)
- linearizable operations, [Implementing Linearizable Systems](/en/ch10#sec_consistency_implementing_linearizable)
- locks and leader election, [Locking and leader election](/en/ch10#locking-and-leader-election)
- observers, [Service discovery](/en/ch10#service-discovery)
- use for service discovery, [Load balancers, service discovery, and service meshes](/en/ch5#sec_encoding_service_discovery), [Service discovery](/en/ch10#service-discovery)
- use for shard assignment, [Request Routing](/en/ch7#sec_sharding_routing)
- use of Zab algorithm, [Consensus](/en/ch10#sec_consistency_consensus)