Q80. Why is schema design for HBase is different from relational database design?
A80. HBase is a columnar NoSQL database. This means no two rows in a table need to have the same columns. In a columnar database table, each row contains cells containing
row-key (unique) + column-family + column qualifier + timestamp
Values are stored against each cell. Each “column family” will be stored in its own “HFile”. It is a best practice to not have too many column families.
Relational databases support ACID, whereas the NoSQL database like HBase supports BASE (Basically Available Soft state, Eventual consistency).
ACID compliant strict consistency means
When your bank balance is $50, and you deposit $100, no matter which ATM you query from immediately, your balance will be $150.00.
Eventual consistency means
in highly distributed systems like Hadoop, your data gets distributed to multiple servers, for example 100+ nodes. Your client apps can access any one of the servers to retrieve data. When someone append data to one of the servers, it takes time for that data to be replicated to all the nodes in the cluster. When a client application queries a server, it gets the most up to date data, but another client app querying a different server may get the old data as the recent write is not yet replicated to that server.
Eventually all the nodes will be up to date.
Structured Vs Unstructured schema & data
Relational databases are strictly structured, whereas the non-relational databases are unstructured. The concept of non-relational databases came into picture to handle rapid growth of unstructured data and to scale them out easily.
Variable schema where each row is slightly different:
RK1 key1=val1, key3=val3
RK2 key2=val2, key3=val3, key4=val4, key5=val5
RK1 is a row-key, and a row contains key/value pairs. Values are stored in files against a cell, so it is a best practice to keep column family & qualifier name as short as possible. In most cases the column families are represented as a single character. E.g. “a:id” to represent “account” column family with “id” column qualifier.
1. Table: name-space defined at design time.
2. Row: atomic key/value container with one row-key. gets periodically flushed to a disk. Each row is flushed to one or more files.
3. column-family: groups columns into physical files (HFile).
4. column qualifier: key in the key-value pair. Each row can have different keys. Columns can be defined at runtime.
5. Timestamp: long milliseconds, sorted in descending order. Serve as versions.
6. value: a time versioned value in the key-value container.
Q81. What is the purpose of “timestamp” in HBase?
A81. A row-key, column family and column qualifier form a cell that has a value and a timestamp that represents the value’s version.
All cell values are stored in a descending order by its timestamp so that if the timestamp is not provided then HBase will return the cell value with the most recent timestamp. If a timestamp is not specified during the write, the current timestamp is used.
The maximum number of versions for a given column qualifier or family can be stored as a part of the schema at table creation. It can be specified via alter table command as well. The default value is 1. You can also globally set up a maximum number of versions per column.
HBase does not overwrite row values. It stores different values per row by timestamp and column qualifier. Extra versions above the current max version setup are removed during major compactions.
Q82. Can you explain the concept of “query-first” schema design in HBase?
A82. This means all possible queries should be identified first, and then the schema model should be designed to cater for the usage patterns. This means the data that are read together are stored together.
When to use HBase?
1. If you plan to scan entire HBase table or majority of it, you are using HBase for wrong purpose.
2. Row key design is very important. If you are not using a filter against rowkey column in your query, your rowkey design may be wrong.
Scan scan = new Scan();
3. When you create external tables in Hive/Impala tables over HBase tables,ensure to map the HBase rowkey against a string column in Hive / Impala.
Normalize or De-normalize?
In a relational database, you normalize the schema to eliminate redundancy by putting repeating information into a table of its own, which has the benefits of no needing to update multiple copies of the same data and reduction of the storage sizes. Normalization requires joins as data need to be retrieved from multiple tables, hence the queries can take longer to execute. Often in HBase you de-normalize or duplicate data so that the data can be read and stored together. De-normalization can be considered as a replacement for joins.
one-to-many relationships can be modelled in HBase tables as a single row when the only way you get at the child entities is via the parent entity.
Example: Say an ACCOUNT has many TRANSACTION(s).
1) Create an “ACCOUNT” table.
2) Create a “TXNS” column family and add columns at run time like TXNS:1:$150000.00, TXNS:2:3000.50, etc.
The usage pattern is: request all the transactions for an account. No additional look ups are required.
If you need to frequently run queries on your transactions, then you may want to design your schema as
1) Create an TXN table to store transaction details.
2) Create a separate ACCOUNT table and then TXNS column family to store only the transaction ids.
In the above approach to get transaction details for an account, you need to perform additional lookups in the TXN table. The join logic will be implemented in your Spark or Map Reduce application code.
Q83. How does HBase scale?
A83. HBase scales by splitting all rows into regions. Each region is hosted by one server. Each region has its own memstores. Writes are sorted lexicographically by “row key” and held in memory until flushed. Reads merge rows in memory with flushed files (i.e. HFile). Reads and writes to a single row are consistent since a single “Region” always manage an entire row.
Multiversioning helps to avoid “edit conflicts” caused by concurrent data access and to retain data for whatever time it is needed as long as enough storage is provided.
When data schemas are properly designed, HBase provides excellent random read performance and near-optimal write operations in terms of I/O. Also makes an efficient use of storage by supporting pluggable compression algorithms.
Q84. Why are records stored lexicographically by the row-key?
A84. Records in HBase are sorted lexicographically by the row-key as this allows fast access to an individual record by its row-key and fast fetching of a range of data given start and stop row-keys.
Q85. What is a region server “hotspotting” problem ?
A85. This problem is caused by bad “row-key” design. When you write records to an HBase table with “sequential row keys“, it causes undesirable region server hotspotting. Sequential row keys start with:
1) Timestamps: new Date().getTime();
2) Increasing or decreasing sequence numbers: 001, 002, 003, …. OR 999, 998, 997 …..
Sequential keys are very efficient for reading a range of data with start and stop keys, for example finding records for a time range, but very inefficient for writing as records with sequential keys hit the same region that lives on just one Region Server as writes are sorted lexicographically by row-key. Each Region has a pre-configured maximum size, so after a Region reaches that size it is split into two smaller Regions. So, a single region that takes all the records causes the hotspot issue, and adversely impacts the write throughput.
How to solve the hotspot issue?
The easiest way to prevent a single Region server hotspotting is by distributing the writes over multiple Regions by using random row-keys. This approach can adversely impact fast range scans using start and stop keys. You can get the best of both worlds by designing your row-keys prefixed by randomness, and then followed by a sequential number. This will allow the records to be split into multiple regions on different region servers, and the records in each region server will preserve the sequence order. One such technique is called “salting the original row-key”.
original-row-key = account_id + txn_id
salted-row-key = (++index % BUCKETS_NUMBER) + original-row-key
When you start writing into an empty table, you’ll be writing a single region and have to wait some time before you have several regions depending on the volume and velocity of incoming data, compression, and maximum region size.
Q86. What are the different approaches to design row-keys prevent hotspotting?
A86. 1) Salting 2) Hashing 3) Reversing the key.
Prefixing the key with a random value as explained above.
Creating a one-way hash that would cause a given row to be always salted with the same prefix.
Reversing the key
Reverse a sequential row-key so that the part that changes most often (i.e. lower order bytes or least significant digits) are at the front. This randomizes the row-keys. E.g. milliseconds of the timestamp at the front.
Q87. If you have the following entities, what are the different ways you can design the schemas?
Account: name, category, balance.
Transaction: id, name, amount.
A87. There are different ways to design the schemas.
#1. Store as JSON string in a single column qualifier
1) One table with two column families: “a” for account details, and t” for transaction details.
2) Each column family will have one column to store all the account and transaction details as a JSON string.
Merit: Less space, and to get all the transactions for a account, a single “Get” query.
De-merit: JSON string has to be parsed for every query, which may slow down reading of complex queries.
#2. Stores in multiple column families
1) One table with one column-family for the account fields, and a separate column-family for each transaction fields: id, name, and amount.
2) Each transaction column-family will have multiple columns as index 1,2,3… to store transaction id1, id2, id3…, name1, name2, name3, …. and amount1, amount2, amount3, …. in the same order. Null attributes need to be stored as well to preserve the corresponding order.
Merit: fast read for different types of queries.
De-merit: High space requirements due to large number of column-families, and null fields need to be stored as well. Performance also gets impacted by more number of column families.
#3. Store in partially normalized tables
1) Two tables. One for “account” details and relevant non null transaction ids. The second table is a look up “transaction” table for transaction details. Store only non null fields. The row-key for the “transaction” table is transaction id.
2) The account table has two column families, one for account details, and the other for the transaction ids 1,2,3. In order to get the transaction information, you have to look into the separate “transaction” table by transaction id.
Merit: Using a “Get” we can get all transaction ids for a account.
De-merit: We need to perform extra lookups into transaction table to query on transaction attributes like name or amount. An efficient way for doing in such case is bulk get.
#4. Create composite row-keys
1) This schema contains a single table with row id as a composite key of various ids.For example, row id can be “account id + transaction id”.
2) All non null fields of a transaction can be stored into a single column-family to prevent separate look ups. In order to access all transactions of a single account you can perform a range scan like “account123_*”