PostgreSQL中query_planner主計劃函數(shù)make_one_rel的實現(xiàn)邏輯

這篇文章主要介紹PostgreSQL中query_planner主計劃函數(shù)make_one_rel的實現(xiàn)邏輯,文中介紹的非常詳細,具有一定的參考價值,感興趣的小伙伴們一定要看完!

創(chuàng)新互聯(lián)建站成都企業(yè)網(wǎng)站建設(shè)服務(wù),提供網(wǎng)站設(shè)計、成都網(wǎng)站建設(shè)網(wǎng)站開發(fā),網(wǎng)站定制,建網(wǎng)站,網(wǎng)站搭建,網(wǎng)站設(shè)計,響應(yīng)式網(wǎng)站設(shè)計,網(wǎng)頁設(shè)計師打造企業(yè)風(fēng)格網(wǎng)站,提供周到的售前咨詢和貼心的售后服務(wù)。歡迎咨詢做網(wǎng)站需要多少錢:18982081108

一、Paths and Join Pairs

Paths and Join Pairs
訪問路徑和連接對

During the planning/optimizing process, we build "Path" trees representing
the different ways of doing a query.  We select the cheapest Path that
generates the desired relation and turn it into a Plan to pass to the
executor.  (There is pretty nearly a one-to-one correspondence between the
Path and Plan trees, but Path nodes omit info that won't be needed during
planning, and include info needed for planning that won't be needed by the
executor.)

在計劃/優(yōu)化過程中,通過構(gòu)建"Path"樹表示執(zhí)行查詢的不同方法,在這些方法中,選擇生成所需Relation的最低成本路徑(Path結(jié)構(gòu)體)并轉(zhuǎn)換為Plan結(jié)構(gòu)體傳遞給執(zhí)行器.(Path和Plan樹兩者之間幾乎存在一對一的對應(yīng)關(guān)系,但Path節(jié)點省略了在計劃期間不需要但包含了在執(zhí)行期間不需要而在計劃期間需要的信息)

The optimizer builds a RelOptInfo structure for each base relation used in
the query.  Base rels are either primitive tables, or subquery subselects
that are planned via a separate recursive invocation of the planner.  A
RelOptInfo is also built for each join relation that is considered during
planning.  A join rel is simply a combination of base rels.  There is only
one join RelOptInfo for any given set of baserels --- for example, the join
{A B C} is represented by the same RelOptInfo no matter whether we build it
by joining A and B first and then adding C, or joining B and C first and
then adding A, etc.  These different means of building the joinrel are represented as Paths.  For each RelOptInfo we build a list of Paths that
represent plausible ways to implement the scan or join of that relation.
Once we've considered all the plausible Paths for a rel, we select the one
that is cheapest according to the planner's cost estimates.  The final plan
is derived from the cheapest Path for the RelOptInfo that includes all the
base rels of the query.

優(yōu)化器為查詢中使用到的每一個基礎(chǔ)關(guān)系(base relation)創(chuàng)建RelOptInfo結(jié)構(gòu)體.基礎(chǔ)關(guān)系(base relation)包括原始表(primitive table),子查詢(通過獨立的計劃器遞歸調(diào)用生成計劃)以及參與連接的Relation.連接生成的關(guān)系(join rel)是基礎(chǔ)關(guān)系的組合(可以理解為通過連接運算生成的新關(guān)系).對于給定的基礎(chǔ)關(guān)系集合,只有一個連接的RelOptInfo結(jié)構(gòu)體生成,而這個RelOptInfo如何生成(比如基礎(chǔ)關(guān)系集合{A B C},A和B可以先連接然后與C連接,當然也可以B和C先連接然后在與A連接)并不關(guān)心.這些構(gòu)建join rel的方法通過Paths表示.對每一個RelOptInfo,優(yōu)化器會構(gòu)建Paths鏈表來表示這些"貌似有理的"實現(xiàn)掃描或者連接的方法.對于這些所有可能的路徑,優(yōu)化器會根據(jù)成本估算選擇成本最低的訪問路徑.最后的執(zhí)行計劃從RelOptInfo(最終生成的新關(guān)系)成本最低的訪問路徑產(chǎn)生.

Possible Paths for a primitive table relation include plain old sequential
scan, plus index scans for any indexes that exist on the table, plus bitmap
index scans using one or more indexes.  Specialized RTE types, such as
function RTEs, may have only one possible Path.

訪問原始表的可能路徑包括普通的順序掃描/基于索引的索引掃描/位圖索引掃描.對于某些特殊的RTE類型,比如函數(shù)RTEs,可能只有一種可能的路徑.

Joins always occur using two RelOptInfos.  One is outer, the other inner.
Outers drive lookups of values in the inner.  In a nested loop, lookups of
values in the inner occur by scanning the inner path once per outer tuple
to find each matching inner row.  In a mergejoin, inner and outer rows are
ordered, and are accessed in order, so only one scan is required to perform
the entire join: both inner and outer paths are scanned in-sync.  (There's
not a lot of difference between inner and outer in a mergejoin...)  In a
hashjoin, the inner is scanned first and all its rows are entered in a
hashtable, then the outer is scanned and for each row we lookup the join
key in the hashtable.

連接通常出現(xiàn)在兩個RelOptInfo之間,俗稱外部表和內(nèi)部表,其中外部表又稱驅(qū)動表.在Nested Loop連接方式,對于外部的每一個元組,都會訪問內(nèi)部表以掃描滿足條件的數(shù)據(jù)行.Merge Join連接方式,外部表和內(nèi)部表的元組已排序,順序訪問外部表和內(nèi)部表的每一個元組即可,這種方式只需要同步掃描一次.Hash Join連接方式,首先會掃描內(nèi)部表并建立HashTable,然后掃描外部表,對于外部表的每一行掃描哈希表找出匹配行.

A Path for a join relation is actually a tree structure, with the topmost
Path node representing the last-applied join method.  It has left and right
subpaths that represent the scan or join methods used for the two input
relations.

join rel的訪問路徑實際上是一種樹狀結(jié)構(gòu),最頂層的路徑節(jié)點表示最好應(yīng)用的連接方法.這顆樹有左右兩個子路徑(subpath)用以表示兩個relations的掃描或連接方法.

二、Join Tree Construction

Join Tree Construction
連接樹的構(gòu)造

The optimizer generates optimal query plans by doing a more-or-less
exhaustive search through the ways of executing the query.  The best Path
tree is found by a recursive process:

優(yōu)化器盡可能的通過窮盡搜索的方法生成最優(yōu)的查詢執(zhí)行計劃,最優(yōu)的訪問路徑樹通過以下遞歸過程實現(xiàn):

1).Take each base relation in the query, and make a RelOptInfo structure
for it.  Find each potentially useful way of accessing the relation,
including sequential and index scans, and make Paths representing those
ways.  All the Paths made for a given relation are placed in its
RelOptInfo.pathlist.  (Actually, we discard Paths that are obviously
inferior alternatives before they ever get into the pathlist --- what
ends up in the pathlist is the cheapest way of generating each potentially
useful sort ordering and parameterization of the relation.)  Also create a
RelOptInfo.joininfo list including all the join clauses that involve this
relation.  For example, the WHERE clause "tab1.col1 = tab2.col1" generates
entries in both tab1 and tab2's joininfo lists.

1).為查詢中每個基礎(chǔ)關(guān)系生成RelOptInfo結(jié)構(gòu)體.為每個基礎(chǔ)關(guān)系生成順序掃描或索引掃描訪問路徑.這些生成的訪問路徑存儲在RelOptInfo.pathlist鏈表中(實際上,在此過程中優(yōu)化器已經(jīng)拋棄了明顯不合理的訪問路徑,在pathlist中的路徑是生成排序路徑和參數(shù)化Relation的最可能路徑).在此期間,會生成RelOptInfo.joininfo鏈表,用于保存與此Relation相關(guān)的索引的連接語句(join clauses).比如,WHERE語句"tab1.col1 = tab2.col1"在tab1和tab2的joininfo鏈表中會產(chǎn)生相應(yīng)的數(shù)據(jù)結(jié)構(gòu)(entries).

If we have only a single base relation in the query, we are done.
Otherwise we have to figure out how to join the base relations into a
single join relation.

如果查詢中只有一個基礎(chǔ)關(guān)系,優(yōu)化器已完成所有工作,否則的話,優(yōu)化器需要得出如何連接基礎(chǔ)關(guān)系,從而得到一個新關(guān)系(通過join連接而來).

2).Normally, any explicit JOIN clauses are "flattened" so that we just
have a list of relations to join.  However, FULL OUTER JOIN clauses are
never flattened, and other kinds of JOIN might not be either, if the
flattening process is stopped by join_collapse_limit or from_collapse_limit
restrictions.  Therefore, we end up with a planning problem that contains
lists of relations to be joined in any order, where any individual item
might be a sub-list that has to be joined together before we can consider
joining it to its siblings.  We process these sub-problems recursively,
bottom up.  Note that the join list structure constrains the possible join
orders, but it doesn't constrain the join implementation method at each
join (nestloop, merge, hash), nor does it say which rel is considered outer
or inner at each join.  We consider all these possibilities in building
Paths. We generate a Path for each feasible join method, and select the
cheapest Path.

2)通常來說,顯式的JOIN語句已被扁平化(flattened)處理,優(yōu)化器可以直接根據(jù)關(guān)系鏈表進行連接.但是,全外連接(FULL OUTER JOIN)以及某些類型的JOIN無法扁平化(比如由于join_collapse_limit或from_collapse_limit這些約束條件).這里會遇到這么一個問題:嘗試以任意順序進行連接的關(guān)系鏈表,鏈表中的子鏈表必須在兩兩連接之前進行連接.優(yōu)化器會同自底向上的方式遞歸處理這些子問題.注意:連接鏈表限制了連接順序,但沒有限制連接的實現(xiàn)方法或者那個關(guān)系是內(nèi)表或外表,這些問題在生成訪問路徑時解決.優(yōu)化器會為每個可行的連接方式生成訪問路徑,并選擇其中成本最低的那個.

For each planning problem, therefore, we will have a list of relations
that are either base rels or joinrels constructed per sub-join-lists.
We can join these rels together in any order the planner sees fit.
The standard (non-GEQO) planner does this as follows:

對于每一個計劃過程中出現(xiàn)的問題,優(yōu)化器把每一個構(gòu)成子連接鏈表(sub-join-list)的
基礎(chǔ)關(guān)系或連接關(guān)系存儲在一個鏈表中.優(yōu)化器會根據(jù)看起來合適的順序連接這些關(guān)系.標準的(非遺傳算法,即動態(tài)規(guī)劃算法)計劃器執(zhí)行以下操作:

Consider joining each RelOptInfo to each other RelOptInfo for which there
is a usable joinclause, and generate a Path for each possible join method
for each such pair.  (If we have a RelOptInfo with no join clauses, we have
no choice but to generate a clauseless Cartesian-product join; so we
consider joining that rel to each other available rel.  But in the presence
of join clauses we will only consider joins that use available join
clauses.  Note that join-order restrictions induced by outer joins and
IN/EXISTS clauses are also checked, to ensure that we find a workable join
order in cases where those restrictions force a clauseless join to be done.)

對于每一個可用的連接條件,考慮使用兩兩連接的方式,為每一對RelOptInfo的連接生成訪問路徑.如果沒有連接條件,那么會使用笛卡爾連接,因此會優(yōu)先考慮連接其他可用的Relation.

If we only had two relations in the list, we are done: we just pick
the cheapest path for the join RelOptInfo.  If we had more than two, we now
need to consider ways of joining join RelOptInfos to each other to make
join RelOptInfos that represent more than two list items.

如果鏈表中只有2個Relations,優(yōu)化器已完成所有工作,為參與連接的RelOptInfo挑選最優(yōu)的訪問路徑即可.否則的話(>3個Relations),優(yōu)化器需要考慮如何兩兩進行連接.

The join tree is constructed using a "dynamic programming" algorithm:
in the first pass (already described) we consider ways to create join rels
representing exactly two list items.  The second pass considers ways
to make join rels that represent exactly three list items; the next pass,
four items, etc.  The last pass considers how to make the final join
relation that includes all list items --- obviously there can be only one
join rel at this top level, whereas there can be more than one join rel
at lower levels.  At each level we use joins that follow available join
clauses, if possible, just as described for the first level.

連接樹通過"動態(tài)規(guī)劃"算法構(gòu)造:
在第一輪(先前已描述)中,優(yōu)化器已完成兩個Relations的連接方式;第二輪,優(yōu)化器考慮如何創(chuàng)建三個Relations的join rels;下一輪,四個Relations,以此類推.最后一輪,考慮如何構(gòu)造包含所有Relations的join rel.顯然,在最高層,只有一個join rel,而在低層則可能會有多個join rel.在每一個層次上,如前所述,如果可以的話,優(yōu)化器會按照可用的連接條件進行連接.

For example:
SELECT  *
FROM    tab1, tab2, tab3, tab4
WHERE   tab1.col = tab2.col AND
tab2.col = tab3.col AND
tab3.col = tab4.col

Tables 1, 2, 3, and 4 are joined as:
{1 2},{2 3},{3 4}
{1 2 3},{2 3 4}
{1 2 3 4}
(other possibilities will be excluded for lack of join clauses)

SELECT  *
FROM    tab1, tab2, tab3, tab4
WHERE   tab1.col = tab2.col AND
tab1.col = tab3.col AND
tab1.col = tab4.col

Tables 1, 2, 3, and 4 are joined as:
{1 2},{1 3},{1 4}
{1 2 3},{1 3 4},{1 2 4}
{1 2 3 4}

We consider left-handed plans (the outer rel of an upper join is a joinrel,
but the inner is always a single list item); right-handed plans (outer rel
is always a single item); and bushy plans (both inner and outer can be
joins themselves).  For example, when building {1 2 3 4} we consider
joining {1 2 3} to {4} (left-handed), {4} to {1 2 3} (right-handed), and
{1 2} to {3 4} (bushy), among other choices.  Although the jointree
scanning code produces these potential join combinations one at a time,
all the ways to produce the same set of joined base rels will share the
same RelOptInfo, so the paths produced from different join combinations
that produce equivalent joinrels will compete in add_path().

下面來看看left-handed計劃,bushy計劃和right-handed計劃.比如,在構(gòu)建{1 2 3 4}4個關(guān)系的連接時,在眾多的選擇中存在以下方式,left-handed:{1 2 3} 連接 {4},right-handed:{4}連接{1 2 3}和bushy:{1 2}連接{3 4}.雖然掃描連接樹時一次產(chǎn)生這些潛在的連接組合,但是所有產(chǎn)生相同連接base rels集合的方法會共享相同的RelOptInfo的數(shù)據(jù)結(jié)構(gòu),因此這些不同的連接組合在生成等價的join rel時會調(diào)用add_path方法時相互PK.

The dynamic-programming approach has an important property that's not
immediately obvious: we will finish constructing all paths for a given
relation before we construct any paths for relations containing that rel.
This means that we can reliably identify the "cheapest path" for each rel
before higher-level relations need to know that.  Also, we can safely
discard a path when we find that another path for the same rel is better,
without worrying that maybe there is already a reference to that path in
some higher-level join path.  Without this, memory management for paths
would be much more complicated.

動態(tài)規(guī)劃方法有一個重要的特性(自底向上):那就是在構(gòu)建高層RelOptInfo的訪問路徑前,下層的RelOptInfo的訪問路徑已明確,而且優(yōu)化器確保該訪問路徑是成本最低的.

Once we have built the final join rel, we use either the cheapest path
for it or the cheapest path with the desired ordering (if that's cheaper
than applying a sort to the cheapest other path).

一旦完成了最終結(jié)果join rel的構(gòu)建,存在兩條路徑:成本最低或者按排序的要求最低

If the query contains one-sided outer joins (LEFT or RIGHT joins), or
IN or EXISTS WHERE clauses that were converted to semijoins or antijoins,
then some of the possible join orders may be illegal.  These are excluded
by having join_is_legal consult a side list of such "special" joins to see
whether a proposed join is illegal.  (The same consultation allows it to
see which join style should be applied for a valid join, ie, JOIN_INNER,
JOIN_LEFT, etc.)

如果查詢語句存在外連接或者轉(zhuǎn)換為半連接或反連接的IN或EXISTS語句,那么有些可能的連接順序是非法的.優(yōu)化器通過額外的方法進行處理.

以上是“PostgreSQL中query_planner主計劃函數(shù)make_one_rel的實現(xiàn)邏輯”這篇文章的所有內(nèi)容,感謝各位的閱讀!希望分享的內(nèi)容對大家有幫助,更多相關(guān)知識,歡迎關(guān)注創(chuàng)新互聯(lián)行業(yè)資訊頻道!

網(wǎng)頁題目:PostgreSQL中query_planner主計劃函數(shù)make_one_rel的實現(xiàn)邏輯
分享網(wǎng)址:http://muchs.cn/article0/ighcoo.html

成都網(wǎng)站建設(shè)公司_創(chuàng)新互聯(lián),為您提供外貿(mào)網(wǎng)站建設(shè)虛擬主機、網(wǎng)站排名微信小程序、品牌網(wǎng)站設(shè)計App設(shè)計

廣告

聲明:本網(wǎng)站發(fā)布的內(nèi)容(圖片、視頻和文字)以用戶投稿、用戶轉(zhuǎn)載內(nèi)容為主,如果涉及侵權(quán)請盡快告知,我們將會在第一時間刪除。文章觀點不代表本網(wǎng)站立場,如需處理請聯(lián)系客服。電話:028-86922220;郵箱:631063699@qq.com。內(nèi)容未經(jīng)允許不得轉(zhuǎn)載,或轉(zhuǎn)載時需注明來源: 創(chuàng)新互聯(lián)

網(wǎng)站托管運營