Partial Indexes [CORE3384] #3750

firebird-automations · 2011-03-14T19:33:21Z

Submitted by: Vladimir Arkhipov (arkinform)

Votes: 4

Example:

CREATE UNIQUE INDEX orders_number_idx ON orders (number)
WHERE number is not null

Description from http://www.postgresql.org/docs/8.0/static/indexes-partial.html

"A partial index is an index built over a subset of a table; the subset is defined by a conditional expression (called the predicate of the partial index). The index contains entries for only those table rows that satisfy the predicate."

"A major motivation for partial indexes is to avoid indexing common values. Since a query searching for a common value (one that accounts for more than a few percent of all the table rows) will not use the index anyway, there is no point in keeping those rows in the index at all. This reduces the size of the index, which will speed up queries that do use the index. It will also speed up many table update operations because the index does not need to be updated in all cases."

Very very useful !!!

firebird-automations · 2011-03-14T19:34:02Z

Modified by: Vladimir Arkhipov (arkinform)

description: Example:

CREATE UNIQUE INDEX orders_number_idx ON orders (number)
WHERE number is not null

Description from http://www.postgresql.org/docs/8.0/static/indexes-partial.html:

"A partial index is an index built over a subset of a table; the subset is defined by a conditional expression (called the predicate of the partial index). The index contains entries for only those table rows that satisfy the predicate."

"A major motivation for partial indexes is to avoid indexing common values. Since a query searching for a common value (one that accounts for more than a few percent of all the table rows) will not use the index anyway, there is no point in keeping those rows in the index at all. This reduces the size of the index, which will speed up queries that do use the index. It will also speed up many table update operations because the index does not need to be updated in all cases."

Very very useful !!!

=>

Example:

CREATE UNIQUE INDEX orders_number_idx ON orders (number)
WHERE number is not null

Description from http://www.postgresql.org/docs/8.0/static/indexes-partial.html

"A partial index is an index built over a subset of a table; the subset is defined by a conditional expression (called the predicate of the partial index). The index contains entries for only those table rows that satisfy the predicate."

"A major motivation for partial indexes is to avoid indexing common values. Since a query searching for a common value (one that accounts for more than a few percent of all the table rows) will not use the index anyway, there is no point in keeping those rows in the index at all. This reduces the size of the index, which will speed up queries that do use the index. It will also speed up many table update operations because the index does not need to be updated in all cases."

Very very useful !!!

firebird-automations · 2011-03-14T20:25:49Z

Commented by: @ibaseru

Have doubts about this feature since it was implemented in FoxPro.
I understand that it will be nice for the former FoxPro users, but...

Less space used by index - yes and good. Speedup queries? No. The amount of index pages being read will not be much less than in ordinary index. Histograms are better, mostly knowing that data may change, and such partial index must be rebuilt with new condition to be used in new conditions.

Well, maybe I'm wrong, but I'd like to see more theoretical examples (close to practics) with benefits of indices like this. Examples from PostgreSQL looks too childish.

firebird-automations · 2011-03-14T22:43:04Z

Commented by: Ann Harrison (awharrison)

Firebird differs from Postgres in a couple of ways that affect this
request. One is that Firebird indexes use prefix compression, so
a duplicate index entry has a data length of zero, so common values
are very compact.

The other is that Firebird does not keep multiple index entries for
the same value in the same record. In Postgres, each version of
a record has an entry in every index - so if you modify a non-indexed
field in a record with three indexes defined, you get three new index
entries for that record, each with the same values for the key as
the entries for the older record version. In Firebird, updating a non-
indexed field doesn't affect any indexes. Updating one of three
indexed fields affects at most one index.- if the new value already
existed in an older version, it won't be re-entered.

firebird-automations · 2011-03-15T00:49:06Z

Commented by: Dirk Rothe (drothe)

Real World:
We have a Field that ist mostly Null, but very selective if filled. A lot of queries filter the records by applying the where-condition "Field is Null". A minor number queries locate records by using the Index on this Field.

In query-plans for the the "Field is Null" case, the index of this Field is choosen (because of a higher selectivity) before other "better indexes". We work around this problem by trying to remember to disable the index with "Field+0 is Null" in these cases. This speeds up these queries by some order of magnitude. But from time to time somebody forgets this "trick" - so I would really like to fix this problem by excluding some Values from the Index.

firebird-automations · 2011-03-15T07:11:49Z

Commented by: @dyemanov

Dirk, in this case a value distribution histogram is your friend, as with its existence the optimizer will not use an index for FIELD IS NULL while still be using an index for equality comparisons.

firebird-automations · 2011-03-15T15:43:39Z

Commented by: Vladimir Arkhipov (arkinform)

I see 2 applications of partial indexes:

1. Restrict unique values for not null fields. For example, number of case history for patient in medicine. The number may be not exist yet. But not null number must be unique. I don't know how to do it "gracefully" without partial indexes.

CREATE UNIQUE INDEX patients_histnum_idx ON patients (histnum)
WHERE histnum is not null

2. Large tables with fields where the most of values is null. Patient attribute, for example, allergy (or something like that) where 3-5% of values filled. We never need to find patients without allergy, but we often need to know about the presence of allergy for patients. I understand that Firebird store indexes very compact, but it is still unnecessary time and disk space.

P.S. What about speed of garbage collection for indexes where 99% of values is null? Is it problem or no?

firebird-automations · 2011-03-15T16:00:59Z

Commented by: @dyemanov

As for your case (1), I'm not sure I get it. Firebird allows you to have any number of NULLs inside a uniquely constrained column. What is the problem?

firebird-automations · 2011-03-15T16:01:33Z

Modified by: @dyemanov

Component: Engine [ 10000 ]

Version: 3.0 Alpha 1 [ 10331 ] =>

firebird-automations · 2011-03-15T16:24:13Z

Commented by: Vladimir Arkhipov (arkinform)

Unfortunately I did not know about this feature. Thank you :)

firebird-automations · 2011-03-15T16:36:15Z

Commented by: Vladimir Arkhipov (arkinform)

What about case (2) and speed of garbage collection? Does this affect garbage collection in principle?

firebird-automations · 2011-03-15T22:05:15Z

Commented by: Dirk Rothe (drothe)

Are "value distribution histograms" an expected 3.0 feature? I haven't that found that much information by googling for it.

firebird-automations · 2011-03-16T08:25:32Z

Commented by: @dyemanov

Vladimir: yes, these other points seem to be valid, although the cost of garbage collection in duplicated chains decreased since ODS11.
Dirk: yes, it is scheduled for v3.0, see CORE1686.

firebird-automations · 2011-03-16T11:20:08Z

Commented by: Dirk Rothe (drothe)

Dmitry: Very nice, and a big thank you to all involved!

firebird-automations · 2011-03-16T18:39:33Z

Commented by: Sean Leyne (seanleyne)

My reading of this case is that this issue can be closed, since there is no new action/change which is required. Correct?

firebird-automations · 2011-03-16T18:59:37Z

Commented by: @dyemanov

I'm not sure. Some benefits of this feature still exist, the question is primarily whether they justify the development cost. But regardless, the priority should be decreased to "minor".

firebird-automations · 2018-08-03T11:16:20Z

Commented by: @livius2

If you still need real world sample:

table REPORT with column STATE (something like Firebird Tracker)

values of state:
closed - common value e.g. 1000000 records
open - e.g. 200 records
reported - e.g. 800 records

and you then can create partial index:

CREATE INDEX IXAP_REPORTS_STATE ON REPORT (STATE)
WHERE STATE<>'closed'

this index is small and is used by common queries

firebird-automations · 2019-05-28T15:33:06Z

Commented by: @livius2

If this feature will be done
and it can be then used in referential constraints, then benefit will be much bigger

Example

-----------------------------------------------------
instead of tables NOTES_CUSTOMER, NOTES_PERSON, NOTES_THING ...

-----------------------------------------------------
or one table with many ids

CRATE TABLE NOTES
(
ID BIGINT NOT NULL PRIMARY KEY

, ID_CUSTOMER BIGINT NOT NULL
, ID_PERSON BIGINT NOT NULL
, ID_THING BIGINT NOT NULL
... other refids ...

,REF_TYPE INTEGER NOT NULL
DESCRIPTION VARCHAR(200)
);

-----------------------------------------------------
we can do this:
CRATE TABLE NOTES
(
ID BIGINT NOT NULL PRIMARY KEY
, ID_REF BIGINT NOT NULL
,REF_TYPE INTEGER NOT NULL
DESCRIPTION VARCHAR(200)
);

ALTER TABLE NOTES ADD CONSTRAINT FK_NOTES__CUSTOMER FOREIGN KEY(ID_REF) WHERE REF_TYPE=1 REFERENCES CUSTOMER(ID) ON DELETE NO ACTION ON UPDATE CASCADE;
ALTER TABLE NOTES ADD CONSTRAINT FK_NOTES__PERSON FOREIGN KEY(ID_REF) WHERE REF_TYPE=2 REFERENCES PERSON(ID) ON DELETE NO ACTION ON UPDATE CASCADE;
ALTER TABLE NOTES ADD CONSTRAINT FK_NOTES__THING FOREIGN KEY(ID_REF) WHERE REF_TYPE=3 REFERENCES THING(ID) ON DELETE NO ACTION ON UPDATE CASCADE;

This is maybe not ideal design - but sometimes this will simplify many problems

justin-fyfe · 2021-07-27T22:10:05Z

+1 if this feature can be done. Not for performance but for unique enforcement. For example, consider the table:

CREATE TABLE FOO_TBL (
     ID    UUID NOT NULL,
     USER_ID UUID NOT NULL,
     UNIQUE_VALUE VARCHAR(256) NOT NULL, -- THE UNIQUE VALUE
     START_TS TIMESTAMP NOT NULL, -- THE TIME THE VALUE BECAME ACTIVE
     STOP_TS TIMESTAMP, -- THE TIME THAT THE VALUE WAS NO LONGER ACTIVE
     CONSTRAINT PK_FOO PRIMARY KEY (ID),
     CONSTRAINT FK_FOO_USER FOREIGN KEY (USER_ID) REFERENCES USER_TBL(USER_ID)
);

Basically a table which needs to track changes over time of a value (can be anything like an medical record number, etc.) but basically the business requirement is: value must be unique if it is active.

In PostgreSQL, Oracle, and even SQLite this is done via:

CREATE UNIQUE INDEX FOO_UQ_IDX ON FOO_TBL(UNIQUE_VALUE) WHERE (STOP_TS IS NULL);

i.e. Enforce the unique constraint on UNIQUE_VALUE if it has not been indicated as no longer valid.

If there is a way to do this other than indexes in Firebird I'd love an example - currently we've just removed unique checking on this value when users use Firebird, it would be nice if our Firebird provider could behave the same way as our Postgres, SQLite and Oracle providers :)

hvlad · 2021-07-27T22:23:54Z

You may use expression index right now:

CREATE UNIQUE INDEX FOO_UQ_IDX ON FOO_TBL COMPUTED BY (CASE WHEN STOP_TS IS NULL THEN UNIQUE_VALUE END)

pavel-zotov · 2023-02-18T11:37:22Z

QA: see tests for #7257

firebird-automations added priority: major component: engine type: improvement labels Apr 25, 2021

dyemanov self-assigned this Aug 12, 2022

dyemanov mentioned this issue Aug 12, 2022

Support for partial indices #7257

Merged

livius2 mentioned this issue Aug 13, 2022

Partial Foreign Keys #7260

Open

dyemanov added type: new feature fix-version: 5.0 Beta 1 and removed type: improvement labels Oct 19, 2022

dyemanov changed the title ~~Partial Indexes (not partitioning) [CORE3384]~~ Partial Indexes [CORE3384] Oct 19, 2022

dyemanov closed this as completed Oct 19, 2022

pavel-zotov added qa: done with caveats qa: covered by another tests and removed qa: done with caveats labels Feb 18, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Partial Indexes [CORE3384] #3750

Partial Indexes [CORE3384] #3750

firebird-automations commented Mar 14, 2011

firebird-automations commented Mar 14, 2011

firebird-automations commented Mar 14, 2011

firebird-automations commented Mar 14, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 16, 2011

firebird-automations commented Mar 16, 2011

firebird-automations commented Mar 16, 2011

firebird-automations commented Mar 16, 2011

firebird-automations commented Aug 3, 2018

firebird-automations commented May 28, 2019

justin-fyfe commented Jul 27, 2021

hvlad commented Jul 27, 2021

pavel-zotov commented Feb 18, 2023

Partial Indexes [CORE3384] #3750

Partial Indexes [CORE3384] #3750

Comments

firebird-automations commented Mar 14, 2011

firebird-automations commented Mar 14, 2011

firebird-automations commented Mar 14, 2011

firebird-automations commented Mar 14, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 15, 2011

firebird-automations commented Mar 16, 2011

firebird-automations commented Mar 16, 2011

firebird-automations commented Mar 16, 2011

firebird-automations commented Mar 16, 2011

firebird-automations commented Aug 3, 2018

firebird-automations commented May 28, 2019

justin-fyfe commented Jul 27, 2021

hvlad commented Jul 27, 2021

pavel-zotov commented Feb 18, 2023