Academic Year 2023-2024

DATABASES AND LABORATORY

Teachers

Angelo Montanari
Luca Geatti
Renato Acampora
Course Year
3
Unit Credits
12
Teaching Period
First Period
Course Type
Characterizing
Prerequisites. There are no mandatory prerequisites, but the student should know the fundamentals of programming, algorithms, data structures, and operating systems.
Teaching Methods. Classes will consist of lectures, discussions of exercises, and practical activities at a computer laboratory, devoted to the design, development, implementation, and analysis of a database. The laboratory and the exercises aim at showing to the students how to solve problems related to (i) database design and development; (ii) data definition, manipulation, and query; (iii) definition of index structures; (iv) transaction management. In the lab, the students will learn the basic functionality of a relational DBMS and will apply concepts, methods, and techniques illustrated in the lectures.
Verification of Learning. The students are required to take a written exam covering all the course topics, sometimes a viva voce test (which may be requested by the teacher or by the student), and a database project. As a rule, the oral examination begins with a discussion of the written exam revolving around the weak answers or the missing solutions to the proposed exercises, and it may extend to any course topic.
More Information. Teaching material, including exercises and data sets for lab activities, are available through the elearning platform provided by the university and the web pages of the professors.
Objectives
The overall aim of the course is to learn the concepts, tools and fundamental methodologies in the realm of database management, with a special emphasis on conceptual, logical, and physical modeling, on data definition and manipulation languages, and on database system architecture. The fundamental notions regarding definition and query languages are explained, as well as elements of conceptual modeling (from requirement elicitation and analysis to the construction of Entity-Relationship models), logical modeling (including schema restructuring, translation into the Relational Model, and data normalization), and physical modeling (indexing). Finally, the student will learn about transaction management and will acquire an understanding of the main components of a database management system. After this course, the student will be able to formalize into a relational language operations on data informally expressed in a natural language. The student will also be able to understand and develop conceptual database designs (from requirements to E-R diagrams), logical database designs (from the translation of E-R models into relational schemas to the formalization of relational semantics through functional dependencies and schema normalization), and physical database designs (definition of index data structures).

Abilities related to the specific discipline

1.1. Knowledge and understanding

Notion of “data model”.

Models for conceptual database design.

The Relational Model.

Formal languages for data definition and manipulation: Relational Algebra and Relational Calculus.

Architecture of a database management system.

Policies and mechanisms for concurrent transaction management.

Algorithms and data structures for efficient data indexing.

1.2. Applying knowledge and understanding

Elements of conceptual design: requirement analysis, Entity/Relationship diagrams.

Logical and physical database design.

The SQL language.

Stored procedures, user-defined functions, and triggers.

Case studies: PostgreSQL and MySQL.

Soft skills

2.1. Making judgments

Analyze and formalize the fundamental requirements of an information system.

Compare different solutions to a data management problem.

Correctly formulate complex queries.

Evaluate the opportunity of using indexes for query optimization.

2.2. Communication skills

Interact with (technical and non-technical) stakeholders involved in a data management project.

Convincingly argue about the decisions taken during all the phases of the design of an information system.

2.3. Learning skills

Independently learn how to use other database management software.

Delve into further topics in data management, such as XML databases, data warehouses, data security and privacy.

Tackle data management issues in specific applications, such as biological or spatio-temporal databases.

Contents
Basic concepts. Role and functionalities of a database; data abstractions; conceptual, logical, and physical data models; schemas and instances, logical and physical data independence; data definition and manipulation languages; Database Management Systems (DBMSs); database administrator and users; data dictionary; DBMS architecture.

The Entity/Relationship (ER) and relational models. Methodologies, techniques, and models for database design. The basic constructs of the ER model: entity types and instances, relation types and instances, attributes (simple/compound, single-valued/multiple-valued, optional, derived), keys (internal and external identifiers, weak entities); attribute domains and the use of NULL; participation constraints and cardinality ratio; recursive relations and roles; binary relations and relations with arity >= 2; ER diagrams; documentation of ER schemas; business rules (integrity constraints and derivation rules); advanced ER constructs: specialization and category. UML data modelling. The relational data model: basic notions, relations, constraints, relational database schemas and instances, update operations (insertion, deletion, modification) and possible constraint violations, constraint violation policies.

Database design. Conceptual design: requirement collection and analysis; criteria for data representation; design strategies; qualities of a conceptual schema; tools. Logical design: performance analysis on ER schemas; restructuring of ER schemas (redundancy analysis, elimination of specialization hierarchies, partitioning/ merging of entities/relations, choice of the primary identifier); mapping of ER schemas into relational ones. The logical design of relational databases: functional dependencies, relation decomposition (lossless join and preservation of functional dependencies), normal forms.

Relational algebra, relational calculi, and SQL. Relational algebra: basic operations, queries, derived and additional operations, algebraic optimization, the limitations of relational algebra; the relational calculus: basic notions, domain relational calculus and tuple relational calculus with range specification, the relationships between them, relational calculi and relational algebra. SQL: data definition and data types; queries; updates; additional data definition features (generic integrity constraints, views, specifying constraints as assertions); the DBMS PostgreSQL, notes on SQL and programming languages (triggers, procedures, and functions).

Physical organization of data. The storage of file records and the organization of primary files: introduction, the management of secondary storage, placing file records on disks, operations on files, files of unordered records (heap files) and of ordered records (sorted files), hashing techniques, other primary file organizations, efficient disk access by means of RAID technology. Indexing structures for files: single-level ordered indexes, static multilevel indexes, dynamic multilevel indexes (B-trees and B+-trees), other types of indexes.

Database server technology. Transactions: introduction to transaction processing, properties of transactions, transaction scheduling and recovery, serialization techniques, transaction support in SQL. Buffer manager: architecture, primitives, buffer management policies, relationships between buffer manager and file system. Concurrency control techniques: problems, architecture, anomalies of concurrent transactions, view-based and conflict-based techniques, two-phase locking techniques (2PL and strict 2PL), timestamp method, multiversioning-based techniques, granularity of data items.

Database recovery techniques: basic concepts, architecture, stable memory, log file, transaction processing, fault management. Query processing and optimization: system catalogs, query optimization (query internal representation, relation profiles, cost-based optimization), physical design of databases.

Texts
– Lecture notes.

– Slides and notes on the course web sites.

– One between the following two textbooks:

R. Elmasri, S. Navathe, Fundamentals of Database Systems (7th Edition), Pearson International Education / Addison Wesley, 2016.

P. Atzeni, S. Ceri, P. Fraternali, S. Paraboschi, R. Torlone, Basi di Dati (quinta edizione), McGraw-Hill, 2018.