Chenyo's org-static-blog

17 Jul 2024

CMU 15-445 notes: Relational Model & Algebra

This is a personal note for the CMU 15-445 L1 video and CMU 15-445 L1 notes, along with some terminology explained by Claude.ai.

1. Terminology

1.1. Database

  • An organized collection of inter-related data that models some aspect of the real-world.

1.2. Database design consideration

  • Data integrity: protect invalid writing.
  • Implementation: query complexity, concurrent query.
  • Durability: replication, fault tolerance.

1.3. Database management system (DBMS)

  • A software that manages a database.
  • Allow the definition, creation, query, update and administration of databases.

1.4. Data model

  • A conceptual, high-level representation of how data is structured
  • Defines entities, attributes, relationships between entities and constraints.

1.5. Schema

  • A concrete implementation of a data model.
  • Defines tables, fields, data types, keys and rules.
  • Typically represented by a specific database language.

1.6. Entities and Tables

  • Entities: conceptual representations of objects in the logical data model.
  • Tables: physical storage structures in the physical data model.

1.7. Attributes and Fields

  • Attributes: properties of an entity.
  • Fields: columns in a database table.

1.8. Logical layer

  • The entities and attributes the database has.

1.9. Physical layer

  • How are entities and attributes stored in the database.

1.10. Data manipulation languages (DMLs)

  • Methods to store and retrieve information from a database.
  • Procedural: the query specifies the (high-level) strategy the DBMS should use to get the results, e.g., with relational algebra.
  • Declarative: the query specifies only what data is desired but not how to get it, e.g., with relational calculus (a formal language).

1.11. SQL (Structured Query Language) and relational model

  • SQL implements the relational model in DBMS and provides a standard way to create, manipulate and query relational databases.
  • Different SQL implementation may vary and do not strictly adhere to the relational model, e.g., allow duplicate rows.

2. Relational model

  • A data model that defines a database abstraction to avoid maintenance overhead when changing the physical layer.
  • Data is stored as relations/tables.
  • Physical layer implementation and execution strategy depends on DBMS implementation.
091318_0803_RelationalD1.png
Figure 1: Relational model concepts (Source)

2.1. A relation

  • An unordered set that contains the relationship of attributes that represent entities.
  • Relationships are unordered in the relation.

2.2. A domain

  • A named set of allowable values for a specific attribute.

2.3. A tuple

  • A set of attribute values in the relation.
  • Values can also be lists or nested data structures.
  • Null: a special value in any attribute which means the attribute in a tuple is undefined.
  • \(n-ary\): a relation with \(n\) attributes.

2.4. Keys

  • Primary key: uniquely identifies a single tuple.
  • Foreign key: specifies that an attribute (e.g., CustomerID) in one relation (e.g., OrderTable) has to map to a tuple (e.g., the tuple with the same CustomerID) in another relation (e.g., CustomerTable).

3. Relational Algebra

  • A set of fundamental operations to retrieve and manipulate tuples in a relation.
  • Each operator takes in one or more relations as inputs, and outputs a new relation; operators can be chained.
  • Is a procedure language, meaning the execution always follow the query, even there exists more efficient way to get the same result; A better way is to be more declarative, e.g., SQL’s where syntax.
  • Common relational algebra.
Tags: study database cmu