Markdown: Close anchor tags correctly #38619

Merged 1 commit on Jun 30, 2020.

148 changes: 74 additions & 74 deletions docs/coding-guidelines/clr-code-guide.md

Large diffs are not rendered by default.

198 changes: 99 additions & 99 deletions docs/coding-guidelines/clr-jit-coding-conventions.md

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/design/coreclr/botr/corelib.md
@@ -175,7 +175,7 @@ FCalls require a lot of boilerplate code, too much to describe here. Refer to [f

[fcall]: https://github.com/dotnet/runtime/blob/master/src/coreclr/src/vm/fcall.h

### <a name="gcholes" /> GC holes, FCall, and QCall
### <a name="gcholes"></a> GC holes, FCall, and QCall

A more complete discussion on GC holes can be found in the [CLR Code Guide](../../../coding-guidelines/clr-code-guide.md). Look for ["Is your code GC-safe?"](../../../coding-guidelines/clr-code-guide.md#2.1). This tailored discussion motivates some of the reasons why FCall and QCall have some of their strange conventions.

@@ -248,7 +248,7 @@ FCIMPL1(Object*, AppDomainNative::IsStringInterned, StringObject* pStringUNSAFE)
FCIMPLEND
```

## <a name="register" /> Registering your QCall or FCall method
## <a name="register"></a> Registering your QCall or FCall method

The CLR must know the name of your QCall and FCall methods, both in terms of the managed class and method names, as well as which native methods to call. That is done in [ecalllist.h][ecalllist], with two arrays. The first array maps namespace and class names to an array of function elements. That array of function elements then maps individual method names and signatures to function pointers.

4 changes: 2 additions & 2 deletions docs/design/coreclr/jit/first-class-structs.md
@@ -195,7 +195,7 @@ These work items are organized in priority order. Each work item should be able
proceed independently, though the aggregate effect of multiple work items may be greater
than the individual work items alone.

### <a name="defer-abi-specific-transformations-to-lowering"/>Defer ABI-specific transformations to Lowering
### <a name="defer-abi-specific-transformations-to-lowering"></a>Defer ABI-specific transformations to Lowering

This includes all copies and IR transformations that are only required to pass or return the arguments
as required by the ABI.
@@ -273,7 +273,7 @@ This would be enabled first by [Defer ABI-specific transformations to Lowering](
* Related: #6839, #9477, #16887
* Also, #11888, which suggests adding a struct promotion stress mode.

### <a name="Block-Assignments"/>Improve and Simplify Block and Block Assignment Morphing
### <a name="Block-Assignments"></a>Improve and Simplify Block and Block Assignment Morphing

* `fgMorphOneAsgBlockOp` should probably be eliminated, and its functionality either moved to
`Lowering` or simply subsumed by the combination of the addition of fixed-size struct types and
38 changes: 19 additions & 19 deletions docs/design/coreclr/jit/lsra-detail.md
@@ -386,7 +386,7 @@ critical edges. This also captured in the `LsraBlockInfo` and is used by the res

### Building Intervals and RefPositions

`Interval`s are built for lclVars up-front. These are maintained in an array,
`Interval`s are built for lclVars up-front. These are maintained in an array,
`localVarIntervals` which is indexed by the `lvVarIndex` (not the `varNum`, since
we never allocate registers for non-tracked lclVars). Other intervals (for tree temps and
internal registers) are constructed as the relevant node is encountered.
@@ -402,7 +402,7 @@ node, which builds `RefPositions` according to the liveness model described abov

- Then we create `RefPosition`s for each use in the instruction.

- A use of a register candidate lclVar becomes a `RefTypeUse` `RefPosition` on the
- A use of a register candidate lclVar becomes a `RefTypeUse` `RefPosition` on the
`Interval` associated with the lclVar.

- For tree-temp operands (including non-register-candidate lclVars), we may have one
@@ -451,7 +451,7 @@ node, which builds `RefPositions` according to the liveness model described abov

During this phase, preferences are set:

- Cross-interval preferences are expressed via the `relatedInterval` field of `Interval`
- Cross-interval preferences are expressed via the `relatedInterval` field of `Interval`

- When a use is encountered, it is preferenced to the target `Interval` for the
node, if that is deemed to be profitable. During register selection, it tries to
@@ -469,7 +469,7 @@ During this phase, preferences are set:

- Issue [#22374](https://github.com/dotnet/coreclr/issues/22374) also has a pointer
to some methods that could benefit from improved preferencing.

- Register preferences are set:

- When the use or definition of a value must use a fixed register, due to instruction
@@ -540,10 +540,10 @@ LinearScanAllocation(List<RefPosition> refPositions)
`Interval` to which it is preferenced, if any

- Whether it is in the register preference set for the
`Interval`
`Interval`

- Whether it is not only available but currently unassigned
(i.e. this register is NOT currently assigned to an `Interval`
(i.e. this register is NOT currently assigned to an `Interval`
which is not currently live, but which previously occupied
that register).

@@ -756,7 +756,7 @@ enregisterable variable or temporary or physical register. It contains
- `RefTypeZeroInit` is an `Interval` `RefPosition` that represents the
position at entry at which a variable will be initialized to
zero.

- `RefTypeUpperVectorSave` is a `RefPosition` for an upper vector `Interval`
that is inserted prior to a call that will kill the upper vector if
it is currently occupying a register. The `Interval` is then marked with
@@ -928,7 +928,7 @@ The potential enhancements to the JIT, some of which are referenced in this docu

## Code Quality Enhancements

### <a name="combine"/>Merge Allocation of Free and Busy Registers
### <a name="combine"></a>Merge Allocation of Free and Busy Registers

This is captured as [\#15408](https://github.com/dotnet/coreclr/issues/15408)
Consider merging allocating free & busy regs.
@@ -1012,8 +1012,8 @@ One strategy would be to do something along the lines of (appropriate hand-wavin
the predecessor `varToRegMap`, iterate over the most frequently lclVars in the union of the
live-in, uses and defs, and displace any `Intervals` that are occupying registers that
would be more profitably used by the high-frequencly lclVars, weighing spill costs.
### <a name="avoid-split"/>Avoid Splitting Loop Backedges

### <a name="avoid-split"></a>Avoid Splitting Loop Backedges

This is captured as Issue [\#16857](https://github.com/dotnet/coreclr/issues/16857).

@@ -1050,7 +1050,7 @@ investigating whether it would be worthwhile and cheaper to simply track this in
### Support Reg-Optional Defs

Issues [\#7752](https://github.com/dotnet/coreclr/issues/7752) and
[\#7753](https://github.com/dotnet/coreclr/issues/7753) track the
[\#7753](https://github.com/dotnet/coreclr/issues/7753) track the
proposal to support "folding" of operations using a tree temp when
the defining operation supports read-modify-write (RMW) to memory.
This involves supporting the possibility
@@ -1059,14 +1059,14 @@ never occupy a register.

### Don't Pre-determine Reg-Optional Operand

Issue [\#6361](https://github.com/dotnet/coreclr/issues/6361)
Issue [\#6361](https://github.com/dotnet/coreclr/issues/6361)
tracks the problem that `Lowering` currently has
to select a single operand to be reg-optional, even if either
operand could be. This requires some additional state because
LSRA can't easily navigate from one use to the other to
communicate whether the first operand has been assigned a
register.

### Leveraging SSA form

This has not yet been opened as a github issue.
@@ -1129,30 +1129,30 @@ performance. This would also improve JIT throughput only for optimized code.
References
----------

1. <a name="[1]"/> Boissinot, B. et
1. <a name="[1]"></a> Boissinot, B. et
al "Fast liveness checking for ssa-form programs," CGO 2008, pp.
35-44.
http://portal.acm.org/citation.cfm?id=1356058.1356064&coll=ACM&dl=ACM&CFID=105967773&CFTOKEN=80545349

2. <a name="[2]"/> Boissinot, B. et al, "Revisiting
2. <a name="[2]"></a> Boissinot, B. et al, "Revisiting
Out-of-SSA Translation for Correctness, Code Quality and
Efficiency," CGO 2009, pp. 114-125.
<http://portal.acm.org/citation.cfm?id=1545006.1545063&coll=ACM&dl=ACM&CFID=105967773&CFTOKEN=80545349>


3. <a name="[3]"/>Wimmer, C. and Mössenböck, D. "Optimized
3. <a name="[3]"></a>Wimmer, C. and Mössenböck, D. "Optimized
Interval Splitting in a Linear Scan Register Allocator," ACM VEE
2005, pp. 132-141.
<http://portal.acm.org/citation.cfm?id=1064998&dl=ACM&coll=ACM&CFID=105967773&CFTOKEN=80545349>

4. <a name="[4]"/> Wimmer, C. and Franz, M. "Linear Scan
4. <a name="[4]"></a> Wimmer, C. and Franz, M. "Linear Scan
Register Allocation on SSA Form," ACM CGO 2010, pp. 170-179.
<http://portal.acm.org/citation.cfm?id=1772979&dl=ACM&coll=ACM&CFID=105967773&CFTOKEN=80545349>

5. <a name="[5]"/> Traub, O. et al "Quality and Speed in Linear-scan Register
5. <a name="[5]"></a> Traub, O. et al "Quality and Speed in Linear-scan Register
Allocation," SIGPLAN '98, pp. 142-151.
<http://portal.acm.org/citation.cfm?id=277650.277714&coll=ACM&dl=ACM&CFID=105967773&CFTOKEN=80545349>

6. <a name="[6]"/> Olesen, J. "Greedy Register Allocation in LLVM 3.0," LLVM Project Blog, Sept. 2011.
6. <a name="[6]"></a> Olesen, J. "Greedy Register Allocation in LLVM 3.0," LLVM Project Blog, Sept. 2011.
<http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html>
(Last retrieved Feb. 2012)
42 changes: 21 additions & 21 deletions docs/design/coreclr/jit/ryujit-overview.md
@@ -224,12 +224,12 @@ The top-level function of interest is `Compiler::compCompile`. It invokes the fo
| [Register allocation](#reg-alloc) | Registers are assigned (`gtRegNum` and/or `gtRsvdRegs`), and the number of spill temps calculated. |
| [Code Generation](#code-generation) | Determines frame layout. Generates code for each `BasicBlock`. Generates prolog & epilog code for the method. Emit EH, GC and Debug info. |

## <a name="pre-import"/>Pre-import
## <a name="pre-import"></a>Pre-import

Prior to reading in the IL for the method, the JIT initializes the local variable table, and scans the IL to find
branch targets and form BasicBlocks.

## <a name="importation">Importation
## <a name="importation"></a>Importation

Importation is the phase that creates the IR for the method, reading in one IL instruction at a time, and building up
the statements. During this process, it may need to generate IR with multiple, nested expressions. This is the
@@ -245,7 +245,7 @@ and flagged. They are further validated, and possibly unmarked, during morphing.

The `fgMorph` phase includes a number of transformations:

### <a name="inlining"/>Inlining
### <a name="inlining"></a>Inlining

The `fgInline` phase determines whether each call site is a candidate for inlining. The initial determination is made
via a state machine that runs over the candidate method's IL. It estimates the native code size corresponding to the
@@ -256,7 +256,7 @@ encountered that indicate that it may be unprofitable (or otherwise incorrect).
inlinee compiler's trees are incorporated into the inliner compiler (the "parent"), with arguments and return values
appropriately transformed.

### <a name="struct-promotion"/>Struct Promotion
### <a name="struct-promotion"></a>Struct Promotion

Struct promotion (`fgPromoteStructs()`) analyzes the local variables and temps, and determines if their fields are
candidates for tracking (and possibly enregistering) separately. It first determines whether it is possible to
@@ -269,26 +269,26 @@ individually referenced.
When a lclVar is promoted, there are now N+1 lclVars for the struct, where N is the number of fields. The original
struct lclVar is not considered to be tracked, but its fields may be.

### <a name="mark-addr-exposed"/>Mark Address-Exposed Locals
### <a name="mark-addr-exposed"></a>Mark Address-Exposed Locals

This phase traverses the expression trees, propagating the context (e.g. taking the address, indirecting) to
determine which lclVars have their address taken, and which therefore will not be register candidates. If a struct
lclVar has been promoted, and is then found to be address-taken, it will be considered "dependently promoted", which
is an odd way of saying that the fields will still be separately tracked, but they will not be register candidates.

### <a name="morph-blocks"/>Morph Blocks
### <a name="morph-blocks"></a>Morph Blocks

What is often thought of as "morph" involves localized transformations to the trees. In addition to performing simple
optimizing transformations, it performs some normalization that is required, such as converting field and array
accesses into pointer arithmetic. It can (and must) be called by subsequent phases on newly added or modified trees.
During the main Morph phase, the boolean `fgGlobalMorph` is set on the `Compiler` argument, which governs which
transformations are permissible.

### <a name="eliminate-qmarks"/>Eliminate Qmarks
### <a name="eliminate-qmarks"></a>Eliminate Qmarks

This expands most `GT_QMARK`/`GT_COLON` trees into blocks, except for the case that is instantiating a condition.

## <a name="flowgraph-analysis"/>Flowgraph Analysis
## <a name="flowgraph-analysis"></a>Flowgraph Analysis

At this point, a number of analyses and transformations are done on the flowgraph:

@@ -298,7 +298,7 @@ At this point, a number of analyses and transformations are done on the flowgrap
* Identifying and normalizing loops (transforming while loops to "do while")
* Cloning and unrolling of loops

## <a name="normalize-ir"/>Normalize IR for Optimization
## <a name="normalize-ir"></a>Normalize IR for Optimization

At this point, a number of properties are computed on the IR, and must remain valid for the remaining phases. We will
call this "normalization"
@@ -310,12 +310,12 @@ counts are needed.
* `optOptimizeBools` – this optimizes Boolean expressions, and may change the flowgraph (why is it not done prior to reachability and dominators?)
* Link the trees in evaluation order (setting `gtNext` and `gtPrev` fields): and `fgFindOperOrder()` and `fgSetBlockOrder()`.

## <a name="ssa-vn"/>SSA and Value Numbering Optimizations
## <a name="ssa-vn"></a>SSA and Value Numbering Optimizations

The next set of optimizations are built on top of SSA and value numbering. First, the SSA representation is built
(during which dataflow analysis, aka liveness, is computed on the lclVars), then value numbering is done using SSA.

### <a name="licm"/>Loop Invariant Code Hoisting
### <a name="licm"></a>Loop Invariant Code Hoisting

This phase traverses all the loop nests, in outer-to-inner order (thus hoisting expressions outside the largest loop
in which they are invariant). It traverses all of the statements in the blocks in the loop that are always executed.
@@ -326,7 +326,7 @@ If the statement is:
* Does not raise an exception OR occurs in the loop prior to any side-effects
* Has a valid value number, and it is a lclVar defined outside the loop, or its children (the value numbers from which it was computed) are invariant.

### <a name="copy-propagation"/>Copy Propagation
### <a name="copy-propagation"></a>Copy Propagation

This phase walks each block in the graph (in dominator-first order, maintaining context between dominator and child)
keeping track of every live definition. When it encounters a variable that shares the VN with a live definition, it
@@ -335,20 +335,20 @@ is replaced with the variable in the live definition.
The JIT currently requires that the IR be maintained in conventional SSA form, as there is no "out of SSA"
translation (see the comments on `optVnCopyProp()` for more information).

### <a name="cse"/>Common Subexpression Elimination (CSE)
### <a name="cse"></a>Common Subexpression Elimination (CSE)

Utilizes value numbers to identify redundant computations, which are then evaluated to a new temp lclVar, and then
reused.

### <a name="assertion-propagation"/>Assertion Propagation
### <a name="assertion-propagation"></a>Assertion Propagation

Utilizes value numbers to propagate and transform based on properties such as non-nullness.

### <a name="range-analysis"/>Range analysis
### <a name="range-analysis"></a>Range analysis

Optimize array index range checks based on value numbers and assertions.

## <a name="rationalization"/>Rationalization
## <a name="rationalization"></a>Rationalization

As the JIT has evolved, changes have been made to improve the ability to reason over the tree in both "tree order"
and "linear order". These changes have been termed the "rationalization" of the IR. In the spirit of reuse and
@@ -473,7 +473,7 @@ t0 = LCL_VAR byref V03 arg3 u:1 (last use) $c0
RETURN void $200
```

## <a name="lowering"/>Lowering
## <a name="lowering"></a>Lowering

Lowering is responsible for transforming the IR in such a way that the control flow, and any register requirements,
are fully exposed.
@@ -529,7 +529,7 @@ In such cases, it must ensure that they themselves are properly lowered. This in

After all nodes are lowered, liveness is run in preparation for register allocation.

## <a name="reg-alloc"/>Register allocation
## <a name="reg-alloc"></a>Register allocation

The RyuJIT register allocator uses a Linear Scan algorithm, with an approach similar to [[2]](#[2]). In discussion it
is referred to as either `LinearScan` (the name of the implementing class), or LSRA (Linear Scan Register
@@ -622,7 +622,7 @@ Post-conditions:
* `lvSpilled` flag is set if it is ever spilled
* The maximum number of simultaneously-live spill locations of each type (used for spilling expression trees) has been communicated via calls to `compiler->tmpPreAllocateTemps(type)`.

## <a name="code-generation"/>Code Generation
## <a name="code-generation"></a>Code Generation

The process of code generation is relatively straightforward, as Lowering has done some of the work already. Code
generation proceeds roughly as follows:
@@ -858,8 +858,8 @@ a 'T'.

## References

<a name="[1]"/>
<a name="[1]"></a>
[1] P. Briggs, K. D. Cooper, T. J. Harvey, and L. T. Simpson, "Practical improvements to the construction and destruction of static single assignment form," Software --- Practice and Experience, vol. 28, no. 8, pp. 859---881, Jul. 1998.

<a name="[2]"/>
<a name="[2]"></a>
[2] Wimmer, C. and Mössenböck, D. "Optimized Interval Splitting in a Linear Scan Register Allocator," ACM VEE 2005, pp. 132-141. [http://portal.acm.org/citation.cfm?id=1064998&dl=ACM&coll=ACM&CFID=105967773&CFTOKEN=80545349](http://portal.acm.org/citation.cfm?id=1064998&dl=ACM&coll=ACM&CFID=105967773&CFTOKEN=80545349)
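
The anchor rewrite repeated throughout this diff can be sketched as a small line-oriented script. In HTML, `<a>` is not a void element, so the trailing slash in `<a name="x"/>` is ignored and the anchor stays open over the text that follows; writing `<a name="x"></a>` closes it explicitly. The helper below is a hypothetical illustration, not the tool used for this PR: the function name `close_anchor_tags` and both regular expressions are assumptions, and it only handles the `<a name="...">` forms that appear in this diff.

```python
import re

# Hypothetical patterns covering only the anchor forms seen in this diff.
# Self-closing form: <a name="x"/> or <a name="x" />
SELF_CLOSING = re.compile(r'<a\s+name="([^"]*)"\s*/>')
# Opening tag with no </a> later on the same line: <a name="x">Heading
UNCLOSED = re.compile(r'<a\s+name="([^"]*)"\s*>(?!.*</a>)')


def close_anchor_tags(line: str) -> str:
    """Rewrite named anchors on one line so each is closed with </a>."""
    # First turn self-closing anchors into explicit open/close pairs.
    line = SELF_CLOSING.sub(r'<a name="\1"></a>', line)
    # Then close any anchor that was opened but never closed on this line.
    return UNCLOSED.sub(r'<a name="\1"></a>', line)


if __name__ == "__main__":
    samples = [
        '### <a name="gcholes" /> GC holes, FCall, and QCall',
        '## <a name="importation">Importation',
        '1. <a name="[1]"/> Boissinot, B. et',
    ]
    for sample in samples:
        print(close_anchor_tags(sample))
```

Applying the function a second time leaves an already-closed anchor unchanged, so it is safe to run over files that mix both styles. It works one line at a time, which matches the headings and reference entries touched here but would not handle an anchor whose closing tag sits on a later line.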