Keep in mind: sequence read is much faster than random read
Because it minimizes the number of seek operations (moving the read/write head to a specific location on the storage device) and maximizes the amount of data transferred per operation
What is index ?
Indexes are entry points for tables
Index used to locate the tuples in the table
Index stored separately from table
Create Index :
Get physics file name of index:
Find index file location:
Index types?
B-Tree Index:
A B-Tree index is composed of multiple levels of pages, where each page can store a number of index entries. The index entries are sorted by the values of the indexed column or columns, and each entry also contains a pointer to the location of the corresponding row in the table.
Supported Operators :
Advantages :
A B-Tree index can handle various types of queries, such as equality, range, pattern matching, null checks, and sorting.
A B-Tree index is efficient and balanced, meaning that it does not waste space or become skewed.
A B-Tree index can support unique constraints and primary keys, by ensuring that no two index
entries have the same value.
Limitations :
A B-Tree index cannot store values that are too large or complex, such as arrays or JSON objects.
A B-Tree index cannot support queries that involve operators other than those defined by its operator class.
A B-Tree index may require maintenance and reorganization to keep its performance optimal.
Hash Index
A Hash index is a type of index that PostgreSQL uses to store and retrieve data that can only be compared for equality. A Hash index stores a 32-bit hash code derived from the value of the indexed column, and a pointer to the location of the corresponding row in the table.
Supported Operators :
A Hash index can only handle queries that use the equal operator: =
Advantages:
A Hash index is smaller and faster than a B-Tree index for equality queries, because it does not need to store or compare the actual values of the indexed column.
A Hash index can handle any data type that has a hash function defined, even if the data type does not have a natural ordering.
A Hash index can support unique constraints and primary keys, by ensuring that no two index entries have the same hash code.
Limitations:
A Hash index cannot support queries that involve operators other than equality, such as range, pattern matching, or sorting.
A Hash index cannot store values that are too large or complex to be hashed, such as arrays or JSON objects.
A Hash index may require maintenance and reorganization to keep its performance optimal, especially if the data distribution changes over time.
BRIN Index
BRIN stands for Block Range Index, which means that the index stores summary information about a group of pages that are adjacent to each other in the table. For example, a table that has a column for the date of an order might have the rows sorted by date, so that the earlier orders are located earlier in the table. A BRIN index can store the minimum and maximum date values for each block range, and use them to quickly find the rows that match a given date or range of dates..
Supported Operators:
Range selection
Advantages:
A BRIN index is very small and fast, because it does not need to store or compare the actual values of the indexed column, only the summary information for each block range.
A BRIN index can handle any data type that has a linear sort order, even if the data type does not have a natural ordering.
A BRIN index can support unique constraints and primary keys, by ensuring that no two block ranges have the same summary values.
Limitations:
A BRIN index cannot support queries that involve operators other than equality or range, such as pattern matching or sorting.
A BRIN index cannot store values that are too large or complex to be summarized, such as arrays or JSON objects.
A BRIN index may require maintenance and reorganization to keep its performance optimal, especially if the data distribution changes over time.
GIN Index
A GIN index is a type of index that PostgreSQL uses to store and retrieve data that can be decomposed into multiple components, such as arrays, JSON objects, or full text documents. GIN stands for Generalized Inverted Index, which means that the index stores a mapping from each component value (also called a key) to a list of locations (also called postings) where the value appears in the table.
Supported Operators:
Advantages:
A GIN index is very fast and accurate for queries that involve multiple keys, because it can use the intersection or union of the postings lists to find the matching rows.
A GIN index can support any data type that has a GIN operator class defined, which allows for extensibility and customization.
A GIN index can support unique constraints and primary keys, by ensuring that no two rows have the same set of keys.
Limitations:
A GIN index is larger and slower to update than other types of indexes, because it needs to store and maintain a lot of information for each key.
A GIN index cannot support queries that involve operators other than those defined by its operator class, such as range, pattern matching, or sorting.
A GIN index may require maintenance and reorganization to keep its performance optimal, especially if the data distribution changes over time.
GiST Index
A GiST index is a type of index that PostgreSQL uses to store and retrieve data that can be represented as a geometric shape, such as points, lines, polygons, or circles. GiST stands for Generalized Search Tree, which means that the index can support any data type that has a consistent way of defining the relationship between two values, such as overlap, contain, or distance.
Supported Operators:
Advantages:
A GiST index is very flexible and extensible, because it can support any data type that has a GiST operator class defined, which allows for customization and innovation.
A GiST index is very efficient and scalable, because it uses a balanced tree structure that adapts to the data distribution and minimizes the disk access.
A GiST index can support unique constraints and primary keys, by ensuring that no two values have the same geometric shape.
Limitations:
A GiST index is lossy, meaning that the index may produce false matches, and it is necessary to check the actual table row to eliminate such false matches. (PostgreSQL does this automatically when needed.) GiST indexes are lossy because each value is represented in the index by a bounding box that approximates its shape.
A GiST index cannot support queries that involve operators other than those defined by its operator class, such as equality, range, pattern matching, or sorting.
A GiST index may require maintenance and reorganization to keep its performance optimal, especially if the data distribution changes over time. time.