Foreword
The LoongArch has garnered much attention since its public debut in 2021, disproportionate to its infancy both in terms of age and market share (and we are being complimentary here). Most of the public information surrounding this architecture, however, comes from press releases of the Loongson Technology Corporation Limited (the Loongson Corporation; website is Chinese-only); this is in stark contrast with all the attention it is receiving from (open-source) communities worldwide. This may be enough for those people whose job is primarily attending conferences, telling stories, and making (often empty) promises to bring in investment, but it is definitely nowhere near satisfactory for ordinary developers who have to get the actual job done.
This FAQ document strives to tell the facts around the LoongArch, in hopes of being useful to fellow developers. But commercial things are invariably controversial, so we also make an effort to take a neutral stance and try to equally present the disagreeing opinions.
This document is being updated from time to time, and changes are always accompanied with update dates. The version you are currently reading was last updated on 2022-11-23. (Dates are always in the YYYY-MM-DD format, for ease of tracking changes between the original and the translations.)
Disclaimer: Information presented in this document is all taken from publicly available sources, except those explicitly marked as opinions. Opinions are always explicitly marked as such, and are strictly personal and have nothing to do with the author’s employer, Loongson Corporation or any other entity. The author is not affiliated with any of the companies that are part of the Loongson or MIPS ecosystem.
Changelog
You can view the change details at this article’s Git history.
- 2022-11-23: Updated the upstreaming progress section.
- 2022-07-23: Updated the upstreaming progress section.
- 2022-07-18: Updated the upstreaming progress section.
- 2022-04-26: Updated the upstreaming progress section; support has been merged in dotnet.
- 2022-04-21: Minor updates.
  - Updated the upstreaming progress section.
  - Added external link to the Gentoo/LoongArch project.
  - LoongArch’s Chinese name may already be decided.
- 2022-03-30: Updated the upstreaming progress section.
- 2022-03-06: Added several more topics; minor tweaks all over.
- 2022-02-21: (English version only) Added note on the meanings of “loong” and “Loongson”.
- 2022-02-20: Added translation to English; synced wording adjustments and layout tweaks with the Chinese original.
- 2022-02-15: Adjusted wording.
- 2022-02-13: Adjusted wording; added information about instruction formats and assembly language.
- 2022-02-12: Initial version.
About the ISA
What’s LoongArch?
The LoongArch architecture (LoongArch) is an Instruction Set Architecture (ISA) that has Reduced Instruction Set Computer (RISC) style.
– LoongArch Reference Manual, Volume 1: Basic Architecture
LoongArch is an instruction set architecture designed by the Loongson Corporation, publicly announced in 2020. Shipping started in 2021 with the 3A5000 products.
The Chinese name for LoongArch was supposed to be 龙芯架构 (in Simplified characters, because the Loongson Corporation is based in Beijing; 龍芯架構 in Traditional characters), according to the title and first sentence of the original manual. It just means “Loongson Architecture”. Loongson Corporation applied for the Chinese trademark “龙芯架构” on 2022-01-29.
Note: the word 龙/龍/loong means “Chinese dragon”, or more precisely, just “loong”. The Chinese dragon never breathes fire, for example. (Actually some of them mostly bring about rains and storms when they feel like doing so!)
And the character 芯 means “core/chip” in this context, so the “龙芯/龍芯/Loongson” name has a literal meaning of “Dragon’s Core” or “Dragon’s Chip”.
However, about one month later, the corporation filed another trademark application, this time for “龙架构” (Loong Architecture / Dragon Architecture); and on 2022-04-13 there was a press release with both “LoongArch” and “龙架构” mentioned. This is likely an indication that the Chinese name of LoongArch has finally been decided to be “龙架构”.
What does LoongArch’s logo look like?
LoongArch does not have a logo according to public information. Its trademark is just plain text.
It is actually strange to not have a logo, though, because a good logo certainly helps a lot in brand promotion. Hope we can see one in 2022!
What’s the etymology of the word LoongArch, and how to pronounce it?
(Note: the English pronunciation described here is American.)
Obviously the word is a portmanteau of “龙芯” (Loongson) and “architecture”. Because of this, it should also be pronounced as such, as a mixture of the two words: /lʊŋ˧˥ɕin˥˥/ + /ˈɑɹkɪtɛkt͡ʃɚ/ = /ˈlʊŋ˧˥ˌɑɹk/ (“龙Arc”, “lóng arc”). This would be “lóng à ke” in typical Chinglish accent 😏
In practice, the “Arch” part is often just pronounced /ɑɹt͡ʃ/, the same reason why the word “char” often does not get pronounced as “car”. This pronunciation is acceptable as well. This would be “lóng à chi” or “lóng à qu” in typical Chinglish accent 😏
What are the basic features of LoongArch as an ISA?
LoongArch is a register-register architecture which:
- supports 32-bit and 64-bit operations,
- is little-endian-only,
- has 32-bit fixed-length instruction words.
Some observations
LoongArch opcodes all extend from MSB to LSB, i.e. are allocated in a “prefix encoding” fashion.
This is helpful for conserving the encoding space. Of course, it also means that there is no well-defined “opcode” field in LoongArch; although the 6 highest bits are currently guaranteed to be part of the opcode, and can tell something about instructions’ “functional classification”, there is little more.
Author’s comments:
Prefix encoding is not the optimal choice from a purely technical perspective: suffix encoding achieves the same conservation effect, while also enabling transparent support for compressed instructions on little-endian architectures.
Take the RVC extension for example: all information necessary for determining the instruction length is guaranteed to be present in the first byte fetched, thus the decoder can always correctly figure out the instruction length without asking to fetch more bytes than strictly necessary.
The LoongArch approach precludes the possibility of shorter instruction words in the same machine mode, because the opcode sits at the MSB side and has no well-defined segments, making it impossible to see enough of the opcode with instruction fetches shorter than 4 bytes. For example, suppose the first instruction after a reset or a jump is 2 bytes long. Fetching 4 bytes is clearly wrong here; but if only 2 bytes are fetched while the instruction is actually 4 bytes long, then the fetch may well only see the LSB portion, which is actually operands. But the operand fields are arbitrarily specified by the programmer, and boom! the core runs amok.
If this problem was not overlooked in the design phase, then the most probable reason behind the design decision could be that “code density improvements with 16-bit instruction words are not worthwhile for actual business cases”.
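To make the argument above concrete, here is a minimal sketch. The RISC-V part uses the standard rule that the two lowest bits of an instruction are `11` for 32-bit encodings; the LoongArch part uses the encodings of `add.w` and `sub.w` as found in the public opcode tables (treat the exact values as illustrative). The helper name `rvc_length` is made up for this example.

```python
import struct

def rvc_length(first_byte):
    """Length in bytes of a RISC-V instruction, judged from its first byte only.

    Low two bits == 0b11 means a 32-bit instruction, anything else is a
    16-bit compressed one. (Encodings longer than 32 bits are ignored here.)
    """
    return 4 if (first_byte & 0b11) == 0b11 else 2

# RISC-V, little-endian in memory: `addi a0, a0, 1` and `c.nop`.
rv_addi = struct.pack("<I", 0x00150513)
rv_c_nop = struct.pack("<H", 0x0001)
print(rvc_length(rv_addi[0]), rvc_length(rv_c_nop[0]))

# LoongArch, little-endian in memory: `add.w $a0, $a0, $a1` and
# `sub.w $a0, $a0, $a1`. The distinguishing opcode bits sit in the upper
# half-word, so the first two bytes fetched are identical.
la_add_w = struct.pack("<I", 0x00101484)
la_sub_w = struct.pack("<I", 0x00111484)
same_first_half = la_add_w[:2] == la_sub_w[:2]
print(same_first_half)
```

A decoder that only saw the first two bytes could tell the two RISC-V instructions apart, but not the two LoongArch ones.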
LoongArch is a rather classical RISC design.
Complete with fixed-length instructions, 32 registers, hard-wired zero register, 3-operand instructions, pure computations that do not touch memory, flat memory model, etc…
Some of the LoongArch operations are more powerful than (pre-R6) MIPS and RISC-V (RV64G).
Jumps and PIC-related operations have wide immediate fields; immediates are loaded in 4 instructions at most without shifting; the ABI even reserves one register for future use, while having enough for almost all cases; various bitwise operations lacking in the RISC-V base ISA are present in that of LoongArch.
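As an illustration of the “4 instructions at most, without shifting” claim, the sketch below models how a 64-bit constant can be built with the official mnemonics `lu12i.w`/`ori`/`lu32i.d`/`lu52i.d`. The instruction semantics are written down as understood from the reference manual; the function names (`split_imm64`, `materialize`) are made up for this example, and no attempt is made to skip instructions that a real assembler would omit.

```python
MASK64 = (1 << 64) - 1

def sext(value, bits):
    """Sign-extend the low `bits` bits of `value` to a 64-bit pattern."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value & MASK64

def split_imm64(imm):
    """Split a 64-bit constant into the four immediate fields."""
    imm &= MASK64
    return {
        "lu12i.w": (imm >> 12) & 0xFFFFF,  # bits 31..12 (20 bits)
        "ori": imm & 0xFFF,                # bits 11..0  (12 bits)
        "lu32i.d": (imm >> 32) & 0xFFFFF,  # bits 51..32 (20 bits)
        "lu52i.d": (imm >> 52) & 0xFFF,    # bits 63..52 (12 bits)
    }

def materialize(imm):
    """Simulate the 4-instruction sequence on a 64-bit register."""
    seg = split_imm64(imm)
    rd = sext(seg["lu12i.w"] << 12, 32)                        # lu12i.w rd, si20
    rd |= seg["ori"]                                           # ori rd, rd, ui12
    rd = ((sext(seg["lu32i.d"], 20) << 32) | (rd & 0xFFFFFFFF)) & MASK64  # lu32i.d rd, si20
    rd = (seg["lu52i.d"] << 52) | (rd & ((1 << 52) - 1))       # lu52i.d rd, rd, si12
    return rd

print(hex(materialize(0x123456789ABCDEF0)))
```

Each step only overwrites or ORs in its own bit range, which is why no separate shift instruction is needed.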
On the widths of operations
LoongArch follows the classical approach (as with x86 or MIPS) in defining the widths for operations: for almost all operations, the operand width of a specific opcode does not change with the register width, as determined by the µarch or the current machine mode.
For example, the `add.d` instruction either is illegal in the current machine mode / on a particular core, or always represents the 64-bit addition operation.
The `add.w` instruction always exists (because there is no LoongArch core with at most 16-bit support), and always represents the 32-bit addition operation.
µarch/Machine mode | Instruction | Legal? | Operation represented |
---|---|---|---|
LA32 | add.w | ⭕ | 32-bit addition |
LA32 | add.d | ❌ | - |
LA64 | add.w | ⭕ | 32-bit addition |
LA64 | add.d | ⭕ | 64-bit addition |
Compare this with the RISC-V approach: still using additions for our example, the `add` instruction always operates on native (XLEN) width operands, representing the 32-bit addition on RV32 cores, and 64-bit addition on RV64 cores.
Meanwhile, the `addw` instruction brought by RV64 only operates on the lower 32-bit even on RV64 cores, but this instruction does not exist on RV32 cores.
µarch/Machine mode | Instruction | Legal? | Operation represented |
---|---|---|---|
RV32 | addw | ❌ | - |
RV32 | add | ⭕ | Native-width addition (XLEN=32) |
RV64 | addw | ⭕ | 32-bit addition |
RV64 | add | ⭕ | Native-width addition (XLEN=64) |
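The two tables can be restated as a small executable model. This is only a sketch: the function names are made up, registers are modelled as unsigned Python integers, and the detail that a 32-bit result is sign-extended into a 64-bit register is taken from the respective manuals rather than from the tables above.

```python
def sign_extend(value, from_bits, to_bits):
    """Sign-extend the low `from_bits` of value into a `to_bits`-wide pattern."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)

def la_add_w(a, b, grlen):
    """LoongArch add.w: a 32-bit addition on both LA32 and LA64."""
    return sign_extend(a + b, 32, grlen)

def la_add_d(a, b, grlen):
    """LoongArch add.d: a 64-bit addition, or illegal on LA32."""
    if grlen != 64:
        raise ValueError("add.d is illegal on LA32")
    return (a + b) & ((1 << 64) - 1)

def rv_add(a, b, xlen):
    """RISC-V add: native-width addition, so its meaning follows XLEN."""
    return (a + b) & ((1 << xlen) - 1)

def rv_addw(a, b, xlen):
    """RISC-V addw: a 32-bit addition that exists on RV64 only."""
    if xlen != 64:
        raise ValueError("addw does not exist on RV32")
    return sign_extend(a + b, 32, 64)

# Same mnemonic, same width on LoongArch; same mnemonic, different width on RISC-V.
print(hex(la_add_w(0xFFFFFFFF, 1, 32)), hex(la_add_w(0xFFFFFFFF, 1, 64)))
print(hex(rv_add(0xFFFFFFFF, 1, 32)), hex(rv_add(0xFFFFFFFF, 1, 64)))
```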
(Trivia: this is one of the biggest mistakes your author made, while reverse-engineering LoongArch from scratch before Loongson released the ISA manual – assuming that LoongArch specified its operand widths just like RISC-V. 😂)
How many instruction formats does LoongArch have?
tl;dr
There are 9 typical instruction formats. But in fact there are 39, based on real effort needed in porting low-level software.
Loongson’s “official” stance
According to LoongArch Reference Manual, Volume 1, there are 9 typical instruction formats.
No immediate | With immediate |
---|---|
2R | 2RI8 |
3R | 2RI12 |
4R | 2RI14 |
 | 2RI16 |
 | 1RI21 |
 | I26 |
Which look like this when pictured:
Format | Instruction word layout, bit 31 down to bit 0 |
---|---|
2R | opcode [31:10], rj [9:5], rd [4:0] |
3R | opcode [31:15], rk [14:10], rj [9:5], rd [4:0] |
4R | opcode [31:20], ra [19:15], rk [14:10], rj [9:5], rd [4:0] |
2RI8 | opcode [31:18], imm [17:10], rj [9:5], rd [4:0] |
2RI12 | opcode [31:22], imm [21:10], rj [9:5], rd [4:0] |
2RI14 | opcode [31:24], imm [23:10], rj [9:5], rd [4:0] |
2RI16 | opcode [31:26], imm [25:10], rj [9:5], rd [4:0] |
1RI21 | opcode [31:26], imm low [25:10], rj [9:5], imm high [4:0] |
I26 | opcode [31:26], imm low [25:10], imm high [9:0] |
There are a few instructions whose encoding style is not completely equivalent to these 9 typical instruction formats. However, the number of such instructions is small and the instructions change little, which will not be inconvenient for compiler developers.
“change little” 😏
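For the typical formats, field extraction is plain shifting and masking. The sketch below assumes the layout given in the manual's format figure (`rd` in bits 4..0, `rj` in bits 9..5, `rk` in bits 14..10, immediates starting at bit 10, and the I26 immediate split into a low 16-bit part and a high 10-bit part); the decoder function names are made up for this example.

```python
def bits(word, hi, lo):
    """Extract bits hi..lo (inclusive) of a 32-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def decode_3r(word):
    return {"rk": bits(word, 14, 10), "rj": bits(word, 9, 5), "rd": bits(word, 4, 0)}

def decode_2ri12(word):
    return {"imm": bits(word, 21, 10), "rj": bits(word, 9, 5), "rd": bits(word, 4, 0)}

def decode_i26(word):
    # The low 16 bits of the immediate live in bits 25..10 and the
    # high 10 bits live in bits 9..0, so they must be glued back together.
    return {"imm": (bits(word, 9, 0) << 16) | bits(word, 25, 10)}

# Register numbers: $a0 is r4, $a1 is r5.
print(decode_3r(0x00101484))     # add.w  $a0, $a0, $a1
print(decode_2ri12(0x02800484))  # addi.w $a0, $a0, 1
print(decode_i26(0x50000401))    # a word in I26 shape with a split immediate
```

Note that the decoders never look at the opcode bits: deciding which decoder applies to a given word is exactly the part that the “9 typical formats” do not settle.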
The truth
According to the answer above, in fact LoongArch has no well-defined instruction formats or operand slots. Although most instructions have reasonably consistent encodings, the few that do require special encoding are encoded almost arbitrarily.
Indeed, people still have to define all the instruction format variants when developing (dis-)assemblers, because the machine does not care which format is “more typical” than others; different is different. We can observe this phenomenon in most open-source projects with this kind of low-level handling:
- binutils: MIPS, RISC-V, SPARC; there may or may not be definitions for the different instruction formats, but there is always a complete description of the operand slots.
- LLVM: MIPS, RISC-V; significantly more instruction formats defined than the few “basic formats” described in the manuals.
- QEMU: HPPA, RISC-V; ditto.
If we classify the instruction formats according to the strict rule of “different bit-fields or different operand types, different format”, then the v1.00 LoongArch base ISA has a total of 39 distinct instruction formats. The community-maintained loongarch-opcodes project provides a collection of all publicly known LoongArch instructions, and a precise naming scheme for instruction formats. (Disclaimer: your author is the maintainer of this project.)
Here are the 39 precisely defined LoongArch instruction formats (consult the loongarch-opcodes documentation for meanings of the operand slot names):
Format | Operand fields, from MSB side to LSB side |
---|---|
CdFj | Fj, Cd |
CdFjFk | Fk, Fj, Cd |
CdJ | J, Cd |
CjSd5k16 | Sd5k16 low, Cj, Sd5k16 high |
DCj | Cj, D |
DFj | Fj, D |
DJ | J, D |
DJK | K, J, D |
DJKUa2 | Ua2, K, J, D |
DJKUa3 | Ua3, K, J, D |
DJSk12 | Sk12, J, D |
DJSk14 | Sk14, J, D |
DJSk16 | Sk16, J, D |
DJUk12 | Uk12, J, D |
DJUk14 | Uk14, J, D |
DJUk5 | Uk5, J, D |
DJUk5Um5 | Um5, Uk5, J, D |
DJUk6 | Uk6, J, D |
DJUk6Um6 | Um6, Uk6, J, D |
DJUk8 | Uk8, J, D |
DSj20 | Sj20, D |
DUj5 | Uj5, D |
EMPTY | (no operands) |
FdCj | Cj, Fd |
FdFj | Fj, Fd |
FdFjFk | Fk, Fj, Fd |
FdFjFkCa | Ca, Fk, Fj, Fd |
FdFjFkFa | Fa, Fk, Fj, Fd |
FdJ | J, Fd |
FdJK | K, J, Fd |
FdJSk12 | Sk12, J, Fd |
JK | K, J |
JKUd5 | K, J, Ud5 |
JSd5k16 | Sd5k16 low, J, Sd5k16 high |
JUd5 | J, Ud5 |
JUd5Sk12 | Sk12, J, Ud5 |
JUk8 | Uk8, J |
Sd10k16 | Sd10k16 low, Sd10k16 high |
Ud15 | Ud15 |
Note: all bits not occupied by an operand field are opcode bits. |
As is shown, LoongArch actually has vastly more complex encodings than MIPS or RISC-V, both of which have about 20 distinct formats at most. Although some of the LoongArch formats are definitely mergeable (for example the FdCj and CdFj formats; they are the same if we do not pursue complete correspondence of operand order in assembler syntax), that ship has sailed; also the current scheme does conserve a lot of encoding space after all.
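The format names above are regular enough to be parsed mechanically. The sketch below does so under an assumed reading of the naming scheme, which should be checked against the loongarch-opcodes documentation: `D`/`J`/`K`/`A` are general-purpose register slots, `F`/`C` prefixes mark floating-point and condition-flag registers, `S`/`U` mark signed/unsigned immediates, and the position letters `d`/`j`/`k`/`a`/`m` stand for bit offsets 0/5/10/15/16. The function name `parse_format` is made up for this example.

```python
import re

# Assumed meaning of the position letters: bit offset of the field's lowest bit.
OFFSETS = {"d": 0, "j": 5, "k": 10, "a": 15, "m": 16}

SLOT_RE = re.compile(
    r"(?P<gpr>[DJKA])"                # general-purpose register slot
    r"|(?P<reg>[FC][djka])"           # FP register (F) or condition flag (C)
    r"|(?P<imm>[SU](?:[djkam]\d+)+)"  # immediate, possibly split in several parts
)

def parse_format(name):
    """Return [(slot_name, [(lowest_bit, width), ...]), ...] for a format name."""
    if name == "EMPTY":
        return []
    slots, pos = [], 0
    while pos < len(name):
        m = SLOT_RE.match(name, pos)
        if m is None:
            raise ValueError(f"cannot parse {name!r} at offset {pos}")
        text = m.group(0)
        if m.lastgroup == "gpr":
            parts = [(OFFSETS[text.lower()], 5)]
        elif m.lastgroup == "reg":
            width = 5 if text[0] == "F" else 3  # 32 FP registers, 8 condition flags
            parts = [(OFFSETS[text[1]], width)]
        else:
            parts = [(OFFSETS[p], int(w)) for p, w in re.findall(r"([djkam])(\d+)", text)]
        slots.append((text, parts))
        pos = m.end()
    return slots

for fmt in ("DJSk12", "DJUk5Um5", "Sd10k16", "FdFjFkCa"):
    print(fmt, parse_format(fmt))
```

Under this reading, two formats are “the same” only if the parsed slot lists match exactly, which is the strict rule used to arrive at the count of 39.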
Is the LA464 µarch the same thing as GS464V? Why did it get renamed?
This answer contains speculations.
Frankly speaking, the Loongson Corporation does not really put any special consideration in naming its micro-architectures or cores. For example, the 3A4000 processor contains 4 GS464V cores, but the cores of the much earlier 3B1500 are also called GS464V in the documentation.
Author’s comment: There is the GS464EV name on Wikipedia, and it is coined by the community exactly because of the desire to avoid ambiguity. The µarch of 3A2000/3A3000 is called GS464E; 3A4000, as a “tock”/µarch iteration, finally gained usable vector support by implementing MSA, hence the GS464EV name.
(The former vector instructions developed in-house are lacking in terms of functionality and documentation, so this is why we call MSA “usable”.)
As for the LA464/GS464V, there are expressions like “adjusting ISA of present IP cores” (“调整现有 IP 核的指令系统”) that can be seen in earlier public articles or presentations about the development of the 3A5000 or the LoongArch, e.g. the August 2020 keynote by HU Weiwu. Considering that the 3A4000 and 3A5000 belong to the same tock-tick iteration, such wording may imply that the 3A5000 is just the 3A4000 with a replaced decoder. However, it is not okay to re-use the name, as the instruction set is incompatible after all; the Loongson Corporation ultimately chose to modify all of its documentation and open-source code to mass-replace GS464V with LA464 for the new LoongArch model, in August 2021.
So, overall it is more appropriate to consider the LA464 and GS464V as a not-so-similar “pair of twins”, with roughly the same micro-architecture, but different supported ISA.
Trivia: “GS464” stands for “Godson 4-issue 64-bit”.
“Godson”, despite its obvious impression for an English speaker, is in fact just a sound approximation of “狗剩” (gǒu shèng), “dog leftovers”. Back in the pre-industrial times, it was believed by some Chinese families that newborns with such lowly names are easier to raise; this tradition is known in Chinese as “赖名好养活” or “贱名好养活”.
The switch to the “LA” prefix is presumably because all “GS” cores implemented some variant of MIPS, and the company wanted to cut off this relationship, though.
What’s the relationship between LoongArch and MIPS?
(Please note that all RISC architectures bear a significant resemblance to each other, because all of them are made to perform the same thing called “general-purpose computation”.)
According to public sources, LoongArch and MIPS cannot interoperate, and there is no 1:1 correspondence between some of the important architectural features; though such correspondence exists for many of their instruction semantics.
- LoongArch has entirely different instruction encoding than MIPS.
- LoongArch does not have any form of branch delay slots, while MIPS did not gain optional delay-slot-less branches/jumps until R6.
- LoongArch does not feature some of the historical warts of MIPS, for example the “wonderful” HI/LO accumulators.
- LoongArch’s ABI is based on that of RISC-V, departing from the MIPS tradition. Concepts such as dedicated return value registers and registers reserved for kernel use are abolished.
However, there is objective MIPS influence on LoongArch. For example:
- LoongArch uses predicate registers `$fccX` for floating-point comparison and branches, which are 8 distinct flag bits; this is the same as MIPS, and rarely seen on other modern architectures.
- LoongArch’s privileged architecture is similar to that of MIPS. For example, the LoongArch TLB has a special even/odd entry distinction; this is cumbersome, hence not seen on any other prominent architecture but MIPS.
- Some instructions have identical semantics to their MIPS R6 counterparts, while similar semantically-identical instructions are not found in any other prominent architecture. The LoongArch `maskeqz`/`masknez` and the MIPS R6 `selnez`/`seleqz` are such an example.
- The way some operations are implemented is similar to MIPS R6, only with minor changes. For example, 64-bit immediates are materialized in 4 segments, and the 4 instructions used are `lu12i.w`/`ori`/`lu32i.d`/`lu52i.d` for LoongArch (here using official mnemonics). They only differ from their MIPS R6 counterparts (`lui`/`ori`/`dahi`/`dati`) in the length of each segment: counting from the least significant bits, for LoongArch it is `12/20/20/12`, while for MIPS it is `16/16/16/16`. Also, this particular way of materializing immediates is not seen on any other prominent architecture.
- The Loongson Virtualization Extension is abbreviated as LVZ, which is extremely dubious. Because “Loongson SIMD Extension” is LSX, “Loongson Advanced SIMD Extension” is LASX and “Loongson Binary Translation Extension” is LBT (asymmetry here; it should have been called LBTX), the virtualization extension should be abbreviated LVX or LV; in no way is it consistent to pick two letters from the second word to get LVZ. VZ is MIPS’s name for its virtualization ASE!
- The LoongArch assembler syntax is similar to that of MIPS. Parentheses around memory operands are removed, but registers still have to carry a `$` prefix, and pseudo-instructions such as `move` are named after the MIPS counterparts, different from RISC-V etc.
- Early LoongArch ports of fundamental software such as the toolchain or the Linux kernel are basically just copy-pastes of MIPS code, mass-replacing “MIPS” with “LOONGARCH” along the way. (Of course, because the quality of such code is bound to be low, and the two architectures are not that similar after all, the Loongson Corporation stopped doing this before long.)
What’s the relationship between LoongArch and RISC-V?
(Please note that all RISC architectures bear a significant resemblance to each other, because all of them are made to perform the same thing called “general-purpose computation”.)
According to public sources, LoongArch and RISC-V cannot interoperate, and there is no 1:1 correspondence between some of the important architectural features; though such correspondence exists for many of their instruction semantics if certain conditions are met (restricted to 64-bit operation, for example).
- LoongArch’s privileged architecture and memory management are significantly different to those of RISC-V.
- The supported operations of LoongArch base ISA are mostly a superset of its RISC-V counterpart.
- RISC-V always sign-extends its immediate operands, while LoongArch differentiates based on type of operation (sign-extending for arithmetic operations; zero-extending for logical operations).
- LoongArch has opcodes starting from MSB, precluding compressed instruction words in the RVC way.
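The difference in immediate extension is easy to see with a 12-bit immediate whose bits are all ones. The sketch below models one logical and one arithmetic instruction on 64-bit registers; the function names are made up, and the semantics are written down as understood from the respective manuals.

```python
MASK64 = (1 << 64) - 1

def sext12(imm):
    """Sign-extend a 12-bit immediate field to a 64-bit pattern."""
    imm &= 0xFFF
    if imm & 0x800:
        imm -= 0x1000
    return imm & MASK64

def la_ori(rj, ui12):
    """LoongArch ori: logical operation, immediate is zero-extended."""
    return rj | (ui12 & 0xFFF)

def la_addi_d(rj, si12):
    """LoongArch addi.d: arithmetic operation, immediate is sign-extended."""
    return (rj + sext12(si12)) & MASK64

def rv_ori(rs1, imm12):
    """RISC-V ori: immediate is sign-extended, like every RISC-V immediate."""
    return (rs1 | sext12(imm12)) & MASK64

# Same 12 immediate bits (0xFFF), different outcomes.
print(hex(la_ori(0, 0xFFF)))
print(hex(rv_ori(0, 0xFFF)))
print(hex(la_addi_d(0, 0xFFF)))
```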
There is significant RISC-V influence in the software part of LoongArch, such as the ABI and several fundamental toolchain pieces. LoongArch’s ABI is rather similar to RISC-V’s, and the semantic similarities of instructions are notable as well. Often, simple syntactic tweaks to the RISC-V version are all it takes to port a primitive to LoongArch for some fundamental software.
Some LoongArch instructions have identical semantics as their RISC-V counterparts; some of the architectural features are similar as well. For example:
- The PIC-related `pcaddu12i` behaves the same as the RISC-V `auipc`.
- The register jump `jirl` behaves the same as the RISC-V `jalr`, very unlike the MIPS `jalr`.
- The timekeeping instructions `rdtime.*` are semantically similar to their RISC-V counterparts in that they all return values from a constant-frequency counter.
- LoongArch’s privileged resources live in the CSR space, and the CSR concept obviously comes from RISC-V. This is already the case back to 3A5000’s predecessor, 3A4000.
- RISC-V originally defined 4 privilege levels (from high to low, Machine/Hypervisor/Supervisor/User; Hypervisor was later removed), and LoongArch defines 4 too (from high to low, PLV0/PLV1/PLV2/PLV3). However, LoongArch OSes run at the highest level of PLV0, while it is recommended for RISC-V OSes to run at the Supervisor level.
These, too, are possible influences of RISC-V on LoongArch.
How many ABIs does LoongArch have?
LoongArch currently defines 3x2=6 ABIs, according to the LoongArch ELF psABI.
Data models of which are:
- ILP32 (`int`, `long` and pointers are 32-bit wide; a 32-bit model but not completely excluding 64-bit operations)
- LP64 (`long` and wider types and pointers are 64-bit wide; this is the most common 64-bit data model in Linux space)
And the floating-point support levels are:
- Soft-float (S)
- Single-precision hard-float (F)
- Double-precision hard-float (D)
Currently only the LP64D ABI is fully supported. All publicly available commercial distributions for LoongArch are built with this ABI. If you attempt to use the other ABIs, you are very likely to get all kinds of compilation errors, so usage of these ABIs is not recommended at this early stage of bring-up. In particular, the support for the ILP32 ABIs is known to be very incomplete, and it is extremely likely that builds will just error out immediately if one ever tries.
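The “3x2=6” count is simply every data model combined with every floating-point level. A minimal sketch that spells the combinations out, with names formed by appending the float suffix to the data model as in “LP64D”:

```python
from itertools import product

DATA_MODELS = ("ILP32", "LP64")
FLOAT_LEVELS = ("S", "F", "D")  # soft-float, single hard-float, double hard-float

ABIS = [model + level for model, level in product(DATA_MODELS, FLOAT_LEVELS)]
print(len(ABIS), ABIS)
```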
What happened to LoongArch’s vector extension?
This answer is speculative because the relevant documentation has not been released.
The 3A4000 from the end of Loongson’s MIPS era contains a complete implementation of MIPS’s MSA vector extension. In addition to that, the 3A4000 also has support for the LoongMMI which is inherited from the 2F era, and the LSX/LASX that never appeared in public documentation. Let’s summarize all these vector extensions:
- MSA: 128-bit fixed vector width, according to the MSA64 documentation v1.12.
- LoongMMI: usage extremely similar to the x86 MMX; 64-bit fixed vector width.
- LSX/LASX: public information nearly non-existent aside from a few PPT slides, open-source toolchain code only briefly and quietly appearing before being redacted. LSX should have a fixed vector length of 128-bit, while LASX should have 256.
As can be seen, all the implemented vector instructions operate on fixed vector lengths. Taking the description of the `LSX` and `LASX` bits in the LoongArch Reference Manual, Volume 1, Section 2.2.10.5 Table 3 “The configuration information accessible by the CPUCFG instruction” into consideration (“128-bit vector extension” and “256-bit vector extension” respectively), it is presumed that LoongArch’s LSX/LASX are similar to the LSX/LASX from the MIPS era; at least the vector width should be fixed as well. The instruction encodings must have been changed, and some instructions may get added or removed as well; there is no public documentation and open-source support after all, so no external code makes use of these, and compatibility is not a concern in this case.
Note that novel vector extensions in recent years are all scalable, such as the AArch64 SVE and the RISC-V RVV: software is able to dynamically configure the vector unit to pick the vector width most suitable to the requirement at hand, also meaning software does not need to be modified to take advantage of wider hardware implementations. This is generally a welcome trend, and we noticed a change preparing for scalable vectors in Loongson’s glibc fork. Considering there may be multiple reasons for not releasing the LSX/LASX to the public (especially IP concerns), this might mean that LSX/LASX would never become public, and that a scalable vector implementation similar to RVV would be available at some future time.
What happened to LoongArch’s binary translation extension?
This answer is speculative because the relevant documentation has not been released.
Similar to the case of the vector extension, we can speculate based on the binary translation extension from the MIPS era. Although there are no public instruction encodings or kernel support, we actually already had a sneak peek at the extension, by means of some academic reports and public presentations done by Loongson themselves (the August 2020 keynote by HU Weiwu, the April 2021 presentation by ZHANG Fuxin):
- An EFLAGS register is added to the architectural state, along with purely EFLAGS-updating counterparts for some basic instructions;
- A TOP register is added to the architectural state, along with the corresponding FPU mode bit; semantics of FP register operands is altered to be TOP-based if the x87 emulation mode is enabled (e.g. meaning of `$f2` becomes something like `$f(TOP + 2)` instead).
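The TOP-based operand remapping described above can be pictured with a tiny model. This is speculation on top of speculation: the wrap-around at 8 entries is an assumption borrowed from the depth of the x87 register stack, and the function name is made up; nothing here is taken from Loongson documentation.

```python
X87_STACK_DEPTH = 8  # assumption: mirrors the 8-entry x87 register stack

def effective_fpr(operand, top, x87_mode):
    """Map an FP register number in the instruction to the register actually used.

    With x87 emulation mode off the operand is used as-is; with it on, the
    operand is treated as an offset from TOP, wrapping around the stack depth.
    """
    if not x87_mode:
        return operand
    return (top + operand) % X87_STACK_DEPTH

print(effective_fpr(2, top=3, x87_mode=False))
print(effective_fpr(2, top=3, x87_mode=True))
print(effective_fpr(2, top=7, x87_mode=True))
```

The point of such a hardware aid is that translated x87 code does not have to recompute register numbers in software every time the emulated stack is pushed or popped.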
Loongson later added support for other operations too, such as the ARM conditional execution, and the approach should be similar. So, the x86 and ARM translation aid of LoongArch is likely to be minor tweaks to the previous binary translation extension. As for the translation of MIPS, because both MIPS and LoongArch are classical RISC designs, branch delay slots may be the only hardware aid needed. (Other weird MIPS features such as HI/LO registers are easily implemented in software at translation time, because you have to recover the data flow regardless.)
About software development
There seem to be many organizations related to Loongson/LoongArch on GitHub. Which of these are “official”?
You may have seen one of these organizations already:
- The “Great Loongson Union” (loongson) (your author really dislikes the title btw),
- The LoongArch porting group (loongarch64), and
- The Loongson Community (loongson-community).
Long story short: The list is in decreasing “officiality”, but as with nearly everything Loongson, “officiality” may or may not be something you would want.
Or if you prefer the long story…
In the beginning, the Loongson Corporation had no presence on GitHub (or any of the “international” code forges) at all, so on 2015-07-14, a group of disgruntled open-source developers set up the loongson-community organization as a central place for collaboration. Almost all organization members have no affiliation with the Loongson Corporation; in part because of this, and in part because some of the Loongson people do not agree with this particular way of making technical decisions and doing things, the Loongson Corporation never officially acknowledged this organization as its “community”, even to this day. You know, it is hard for the company to officially recognize the community when it housed goodies like reverse-engineered docs for the 3A4000’s crypto instructions, reverse-engineered LoongArch instructions even before the official manual release, and a collection of other docs free for download, with the official website now requiring registration (read: business lead) for access to any documentation…
Fast forward to 2020, when the LoongArch was already announced in August, but without any open-source code drops yet; longtime MATE developer and open-source contributor @yetist was hired by the Loongson Corporation to work on bringing up the LoongArch. Seeing that there was not a single recognized community out there that their colleagues would like to participate in, naturally, they set up the loongarch64 organization for facilitating early code reviews, on 2020-09-29, just before the National Day holiday. Many people from different backgrounds were invited to the organization; many are Loongson employees, but there are also Deepin/UOS employees and many unaffiliated developers. Internal procedures are set up to ensure quality of submissions; code review is required, and at least 3 approvals are required before merging any PR. Loongson employees are not exempt from the rules, and there has not been any violation so far. Although there are low-quality reviews or even blind LGTM’s from time to time, they are invariably caught by the more prudent reviewers.
Later, in 2021 (perhaps triggered by the need to upstream the Go loong64 port, but your author’s memories may be failing him), some other group (maybe a different department) inside Loongson also found the need for a GitHub organization, and discovered that they had already registered the loongson organization back on 2020-01-01 but had not made any use of it. For unknown reasons, they did not pursue cooperation with the loongarch64 effort; instead they just quietly began their work on “their” organization. No invitation was ever extended to non-Loongson employees, and the team structure seemed like a replication of the corporate organizational structure. Code reviews are also mandated and it is the same 3-approvals rule for merging, but FWIW blind and/or rushed LGTM’s are more common than in loongarch64. There are cases where PRs are merged with little discussion, or closed without any comment; the latter is already considered impolite in community etiquette. Changes not “approved” by Loongson are unlikely to be merged, and it seems there will be a CLA in place for external contributors soon. All in all, this organization is more about internal collaboration and bug reporting than community participation; for code contributions, it may be better to submit directly to the respective “true” upstreams instead.
Trivia: A quick way to compare the openness of the three organizations is to look at the access level of your author. Your author is an owner of the loongson-community organization, an ordinary member of the loongarch64 organization but with write access, while not a part of the loongson organization at all; and that tells a lot!
Are the various LoongArch ports upstreamed already?
There are many pieces, all with varying upstream statuses; some are progressing smoothly, while some are under heated discussion. Your author has summarized the situation with the tables below.
Table legend:
- ✅ – upstreamed and released
- ⏳ – upstreamed, pending release
- 🔍 – under upstream review
- 🔧 – WIP, or under community pre-review before first upstream submission
- ❌ – TODO
(Based on information as of 2022-11-23.)
Emulator and firmware
Project | Status | Dev repository | Notes |
---|---|---|---|
QEMU (target) | ✅ | - | For emulating LoongArch on other arches. Released in 7.1, fully usable in 7.2. |
QEMU (host) | ✅ | - | For emulating other arches on LoongArch hosts. Released in 7.0. |
EDK II | ⏳ | - | Merged. QEMU firmware support pending review. |
Kernels
Project | Status | Dev repository | Notes |
---|---|---|---|
Linux | 🔍 | loongarch-next for end users, and for upstream | Kernel ABI finalized in v5.19, irqchip changes integrated in v6.0, initial EFI boot support went in v6.1. Out-of-the-box usability expected in v6.2. The “for-upstream” loongarch-next branch will only contain code that has passed reviews; head over to GitHub for the ready-to-use (bootable) branch. |
FreeBSD | ❌ | - | |
OpenBSD | ❌ | - | |
RT-Thread | ❌ | - | This is an original Chinese RTOS. Support has been added in its commercial/professional edition, but not the open-source branch. |
GNU Toolchain
Project | Status | Dev repository | Notes |
---|---|---|---|
binutils | ✅ | Loongson fork | Initial support appeared in 2.38, but is incomplete; psABI already incompatibly revised meanwhile. 2.39 is usable but it is recommended to wait for 2.40 for full support of the new psABI. |
gcc | ✅ | Loongson fork | Released in 12.1.0, but it is recommended to stay on bleeding edge (i.e. 13.0.0 snapshots) for the new ELF psABI. |
glibc | ✅ | Loongson fork | Released in 2.36. |
Other toolchain pieces/languages
Project | Status | Dev repository | Notes |
---|---|---|---|
musl | 🔍 | Loongson fork | Under review. |
llvm | ⏳ | Loongson fork | The forked repo does not contain up-to-date code; follow SixWeining’s activities for progress. 16.0.0 should be usable out-of-the-box. |
rust | 🔍 | - | Rushed initial bring-up and MCP. |
go | ✅ | - | Released in go1.19. |
dotnet | ✅ | - | LoongArch64 support has been merged. Released in 7.0. |
openjdk | 🔧 | Loongson fork, Loongson’s jdk8u fork | Build support upstreamed, JIT porting ongoing. |
v8 | ✅ | - | Reviewed and merged. Released in 9.5.3. |
nodejs | ✅ | - | Supported since v18.0.0. |
Other infrastructure projects
Project | Status | Dev repository | Notes |
---|---|---|---|
libbsd | ✅ | - | LoongArch64 support has been merged. Released in 0.11.6. |
libffi | ✅ | - | Merged, improved, and released in 3.4.3. |
libseccomp | 🔍 | GitHub PR | 99% done. |
libunwind | ⏳ | - | LoongArch64 support has been merged. Awaiting upstream release. |
strace | ✅ | - | LoongArch64 support has been merged. Released in 5.17. |
systemd | ✅ | LoongArch64 porting group fork | Basic support appeared in v250 along with new discoverable partition types defined for LoongArch64. |
util-linux | ✅ | - | Support for the new discoverable partition types already merged. Released in 2.38. |
I’m going to port my software to LoongArch. What do I need to prepare for?
You don’t have to specially prepare for anything!
The expectation is for LoongArch to become a “normal” platform for software and hardware development. You just do on LoongArch whatever you used to do for other platforms, such as x86 or ARM, except for those things inherently platform-specific.
If you primarily develop high-level “business logic” with high-level programming languages, you almost never need to care about low-level technical details of the platform. These kinds of things are already taken care of by the open-source community, consisting of all enterprise and individual developers using Loongson products.
If you are an infrastructure developer yourself, or a “business logic” developer that occasionally needs to care about low-level details here and there, the LoongArch documentation provided by the Loongson Corporation is a good starting point.
What does LoongArch’s target tuple look like?
First of all, we need to differentiate between the GNU target tuples and the Debian multi-arch tuples, because the two are not always the same; in particular, for targets supporting multiple ABIs, this fact is guaranteed to be reflected in the Debian multi-arch tuples.
A complete target tuple looks like ARCH-VENDOR-OS-ENV, but the vendor field is often omitted because few contemporary architectures make use of it, making it ARCH-OS-ENV instead; this short form is also called the target triplet.
The most common configuration for a LoongArch system is one running Linux with the LP64D ABI and glibc. According to the LoongArch toolchain convention, the Debian multi-arch tuple for this configuration is loongarch64-linux-gnuf64. The corresponding GNU target tuple is loongarch64-unknown-linux-gnu; the unknown part can be dropped. (The ABI suffixes for floating-point support and extensions can be omitted, if the configuration is the most common for the given ARCH.)
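To make the ARCH-VENDOR-OS-ENV layout concrete, here is a minimal Python sketch that splits a tuple into its fields and shows that the long and short spellings name the same target. The names TargetTuple, parse_target_tuple and short_form are made up for this illustration; this is not how config.sub or any real toolchain parses tuples, and it only handles the simple three- and four-field forms discussed above.

```python
from typing import NamedTuple


class TargetTuple(NamedTuple):
    """Hypothetical container for the four fields of a target tuple."""
    arch: str
    vendor: str
    os: str
    env: str


def parse_target_tuple(text: str) -> TargetTuple:
    """Split a target tuple into ARCH, VENDOR, OS and ENV.

    Accepts the full ARCH-VENDOR-OS-ENV form and the short ARCH-OS-ENV
    triplet; a missing vendor field is reported as "unknown".
    """
    parts = text.split("-")
    if len(parts) == 4:
        arch, vendor, os_name, env = parts
    elif len(parts) == 3:
        arch, os_name, env = parts
        vendor = "unknown"
    else:
        raise ValueError(f"expected 3 or 4 dash-separated fields: {text!r}")
    return TargetTuple(arch, vendor, os_name, env)


def short_form(t: TargetTuple) -> str:
    """Render the triplet form, i.e. the tuple with the vendor field dropped."""
    return f"{t.arch}-{t.os}-{t.env}"


full = parse_target_tuple("loongarch64-unknown-linux-gnu")
short = parse_target_tuple("loongarch64-linux-gnu")
print(full == short)       # True: both spellings describe the same target
print(short_form(full))    # loongarch64-linux-gnu
print(parse_target_tuple("loongarch64-linux-gnuf64").env)  # gnuf64
```

Note that the sketch treats the ENV field as an opaque string, so the ABI suffix in gnuf64 is simply carried along rather than interpreted.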
Note: You may have seen this change to gnuconfig from late 2020, which differs from the latest spec: no consideration for the ABI floating-point or extension suffixes, and a mysterious loongarchx32 check. This is because the change reflected the earliest understanding of the LoongArch ABI inside the Loongson Corporation: the three ARCH values loongarch32, loongarch64 and loongarchx32 correspond 1:1 to the three MIPS ABIs o32, n64 and n32. Of course, people finally realized there is absolutely no reason to blindly copy MIPS when you are starting from scratch after all, so the ABI was re-modeled after that of RISC-V; the so-called “x32” ABI for LoongArch will never get implemented as a result.
What does LoongArch’s GOARCH value look like?
The Loongson Corporation had an argument with the community about the topic back in mid-2021. 😆
The Loongson team ultimately accepted the community proposal though, so the GOARCH for the LA64 is confirmed to be loong64.
Because 32-bit support is still missing here and there, there is no such thing as loong32 at the moment.
How do I quickly familiarize myself with LoongArch assembly?
The manual and the ABI spec are your friends 😉
Syntactically, LoongArch’s assembly language is basically a simplified version of MIPS assembly, but there are a few important differences as well. Based on personal experience, it is easy to quickly on-board oneself by “thinking in RISC-V while writing MIPS”; coupled with reading the manual, mastering the language is easy as well.
- Registers must be prefixed with $, like MIPS. (In RISC-V assembly this is not necessary.)
- The ABI divides registers into three classes $a*, $t* and $s*, like RISC-V. (Different from MIPS; there is no distinct $v* nor $k*.)
- The way of doing PIC is partly like RISC-V (pcaddu12i is equivalent to RISC-V auipc, used in PLT stubs), and partly like AArch64 (pcalau12i is equivalent to AArch64 adrp, used by all ELF psABI v2.00 relocations); both usages are vastly different from MIPS. (The abicall convention is a compromise to the limited functionality of the pre-R6 MIPS ISA, and as such, there is no point carrying it over to the new era.)
- The way of doing TLS is the same as RISC-V. (Different from MIPS; LoongArch has the dedicated $tp register so it is no longer necessary to work around this with things like rdhwr.)
- The register move pseudo-instruction is called move, like MIPS. (Different from x86 or RISC-V; mov or mv are not recognized.)
- The no-op is spelled nop as with most architectures. (Syntactic sugar for andi $zero, $zero, 0.)
- Return from subroutine is jr $ra, like MIPS. (Syntactic sugar for jirl $zero, $ra, 0; the even more convenient ret will only be available from binutils 2.40 and LLVM 16 onwards.)
- Different from MIPS, there are no parentheses around registers that represent memory operands. (ld $a0, 16($a1) becomes ld.d $a0, $a1, 16.)
- A width suffix is needed for the li pseudo-instruction as well. (li.w suffices most of the time; it is seldom necessary to load constants wider than 32 bits.)
- As for operand ordering of instructions, most follow the rule of registers before immediates, and from LSB to MSB in each group. Note that there are exceptions if using manual syntax!
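As a toy illustration of the memory-operand rule in the list above, the following Python sketch rewrites a MIPS-style load or store into the LoongArch spelling. It is only a sketch under assumptions: the helper name mips_to_loongarch and the mnemonic table are made up for this example (only ld becoming ld.d is taken from the text; the other pairs are filled in by analogy), and it ignores everything else that differs between the two ISAs, such as immediate ranges and register naming.

```python
import re

# MIPS load/store mnemonics and their LoongArch counterparts.
# Only "ld" -> "ld.d" comes from the text; the rest are assumed by analogy.
MNEMONICS = {
    "lb": "ld.b", "lh": "ld.h", "lw": "ld.w", "ld": "ld.d",
    "sb": "st.b", "sh": "st.h", "sw": "st.w", "sd": "st.d",
}

# Matches e.g. "ld $a0, 16($a1)":
# mnemonic, data register, offset, base register in parentheses.
MEM_OP = re.compile(r"^\s*(\w+)\s+(\$\w+)\s*,\s*(-?\w+)\((\$\w+)\)\s*$")


def mips_to_loongarch(line: str) -> str:
    """Rewrite one MIPS-style memory access; return other lines unchanged."""
    m = MEM_OP.match(line)
    if m is None or m.group(1) not in MNEMONICS:
        return line.strip()
    op, reg, offset, base = m.groups()
    # LoongArch puts the base register before the immediate, without parentheses.
    return f"{MNEMONICS[op]} {reg}, {base}, {offset}"


print(mips_to_loongarch("ld $a0, 16($a1)"))   # ld.d $a0, $a1, 16
print(mips_to_loongarch("sw $t0, -8($sp)"))   # st.w $t0, $sp, -8
print(mips_to_loongarch("move $a0, $a1"))     # move $a0, $a1
```

The rewritten operands also follow the "registers before immediates" ordering mentioned in the last bullet.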
Aside from these, there are some known inconsistent, misleading or even erroneous descriptions in the current version (v1.00) of the ISA manual, due to Loongson not soliciting reviews from the wider community before publishing the manuals. These are all described in the loongarch-opcodes project’s documentation.
Author’s comments:
The loongarch-opcodes review feedback has already been sent back to the relevant teams at Loongson. But they replied with something like “It’s impossible to modify the manuals like that now, after publication, in part also because there’s no precedent of any other company such as Intel or ARM doing this; developers just have to take some more time to get accustomed” 😏 As if developers inside Loongson actually had similar expertise to their fellow Intel or ARM developers, and everything was done correctly in one go. Or do they?
Regardless, your author and other friends in support of the project are still actively communicating and pushing every fix and improvement forward, in hopes of eliminating as many warts as possible before LoongArch is known to a wider audience. We do not want future developers to fall into the same traps that we have already fallen into.
I’m using C/C++. How do I specify the CFLAGS on LoongArch? How to conditionally compile for LoongArch and its features?
Please consult the LoongArch toolchain convention.
I don’t have LoongArch hardware. How do I test my software on it nevertheless?
You could use QEMU for this most of the time. Both system emulation (emulating a complete LoongArch computer) and user-mode emulation (emulating a LoongArch Linux syscall interface on top of the host Linux kernel) are supported. Usage of QEMU is outside the scope of this documentation; consult other online resources for that.
Note: Target support for LoongArch is fully upstreamed as of 2022-07-23, but QEMU 7.1.0 still contained bugs that effectively prevented linux-user emulation of LoongArch from working, and system emulation mildly suffered as well. QEMU 7.2.0 should be usable out of the box.
About usage
What’s the Gentoo ARCH for LoongArch systems?
Gentoo has assigned ARCH=loong for LoongArch, according to the upstream communication back in August 2021.
This means you are going to write things like ACCEPT_KEYWORDS="~loong".
And this time everything went smoothly, unlike the Go situation. 🤣
What firmware/BIOS do LoongArch systems use? How do I manage boot options? What bootloader do I use?
The “BIOS” concept has been obsolete for a millennium already; why do people still call firmware that? 🤦
Aside from that, both desktop and server LoongArch systems comply with the UEFI specification, and both use ACPI to communicate information about devices. Unlike the earlier ad-hoc MIPS UEFI implementation that was never upstreamed, the UEFI and ACPI implementations for LoongArch have already completed the upstream process, and will be officially supported starting from the next release of the specs. Congratulations to Loongson’s firmware team BTW!
The LoongArch UEFI implementation is fairly standard. This means things are more-or-less the same as the other UEFI platforms such as x86 or ARM64, regarding boot options management and bootloader choice and usage. Whatever (reasonably portable piece) you currently use on the other platforms, like efibootmgr, systemd-boot, grub2, Linux EFI stub, etc., it is (or should be) the same on LoongArch!
What GPUs can be used on LoongArch systems?
As of 2022-07-18, there is no LoongArch CPU in SoC form, so every LoongArch system out there invariably includes a bridge chip. At this point in time, there is only one model of bridge chip, the LS7A1000, that can work with the only LoongArch CPU in existence, the Loongson 3A5000; this bridge chip includes a GPU block that is based on the Vivante GC1000, together with an in-house display controller block. Upstream work for this integrated GPU is currently in progress. (The etnaviv driver cannot work as-is.)
As for compatibility of discrete GPUs, basically any GPU with open-source driver will work if the required firmware is present on the system, typically installed with the linux-firmware package. Of course this largely eliminates the green camp, because the chance for a LoongArch blob is slim (blame Jensen Huang), and the open-source nouveau is borderline unusable (also blame Jensen Huang). AMD Yes!
What soundcard/network interface card/capture card/mouse/keyboard/HDD/$INSERT_YOUR_PERIPHERAL can be used on LoongArch systems?
You can use the said hardware as long as it has open-source Linux support. Closed-source drivers that have LoongArch versions are fine too, if the vendor happens to be so merciful as to provide LoongArch blobs; or if you are some powerful figure who can persuade the vendor into providing such blobs.
If a particular piece of hardware does not work even if a LoongArch driver exists, tell us about it in any of the LoongArch user communities; you can be sure a developer will see it. 😉
About the LoongArch ecosystem
Which Linux distributions have been ported to LoongArch?
Thanks to prioritized hardware access and team collaboration provided by the Loongson Corporation, the commercial development around LoongArch is progressing very rapidly.
As of 2022-07-23, multiple commercial distributions (developed by China mainland entities) already provide LoongArch ports. These include but are not limited to: (in alphabetical order)
- Kylin (from the Kylin Software Corporation; website is Chinese-only)
- Loongnix (from the Loongson Corporation; website is Chinese-only)
- UOS (from the UnionTech Corporation; website is Chinese-only)
Loongnix claims to be the “Linux OS from the Loongson open-source community”, but because there are actually very few external participants if at all, and some of its packages are not open-source (especially the toolchain; the current Loongnix LoongArch port even has vector extension support!), this distribution is in effect a commercial one.
After the publication of the various LoongArch documentation, and open-sourcing of Loongson forks of fundamental pieces of software, the porting pace of community distributions has accelerated as well.
There are several ongoing porting efforts as of 2022-11-23, including but not limited to: (in alphabetical order)
Why can’t I run closed-source software like WPS Office on community distributions? (aka What’s this so-called “old world” and “new world”?)
As of 2022-07-18, all commercial LoongArch distributions are incompatible with all community distributions. All binary software built on community distributions, and some software written in high-level languages and existing in forms like source code or bytecode (such as those written in Python or Java), cannot run on commercial distributions, and vice versa. All closed-source software from ISVs, such as WPS Office, is built on commercial distributions, so it is extremely unlikely to work as-is on community distributions.
This is the so-called compatibility problem between the old-world and the new-world. Because the Loongson Corporation finished all commercial moves before announcing the LoongArch to the open-source community, the open-source LoongArch ecosystem is the new world; in contrast to this, all commercial distributions and the ecosystem associated make up the old world. The two worlds are to be united eventually, but are two parallel universes for now; and the technical difficulty of making the two worlds compatible with each other is enormous.
As for the reason for two worlds and ways to be compatible, that is a long story ;-) Let’s save this for another article, dedicated to the topic and to be written shortly.
What LoongArch hardware can I buy?
Although international ways of purchasing Loongson products have largely died out after the 2F era (when international marketing of Loongson products was taken care of by STMicroelectronics), at least in China mainland, you can easily get your hands on various products with LoongArch CPUs, all of which are publicly available. For example, there are ATX boards, prebuilt computers and notebooks with the 3A5000; and there are rack-mounted servers for the 3C5000L.
This document cannot provide any link to actual products or shops, due to your author having no affiliation with them, but you can always search for “龙芯” yourself on popular Chinese shopping sites like ○宝, ○东 or ○鱼 (you will know the exact names if you have literally any clue about the Chinese language and the Chinese Internet life in general, or just give this document to one of your helpful Chinese friends 😆).
Note: the Loongson products are almost always significantly more expensive than similarly specced “mainstream” x86 or ARM products, due to the small production volume. On top of that, there are more fundamental problems for the Loongson platform to solve at this early stage of development, so it is not recommended for casual users to buy.
How can I get acquainted with other fellow LoongArch fans?
Communities exist for any technology with users, and LoongArch is no exception (your author and his friends are not cats, after all).
There are a lot of places on the Internet where Loongson and LoongArch topics are discussed. Some places frequented by many people are (but not limited to):
- Forum of the Loongson Open-source Community (龙芯开源社区论坛) (originally at bbs.loongnix.cn; shut down by Loongson Corporation)
- LoongArch Unofficial Open-Source Community (spiritual successor of bbs.loongnix.cn; predominately Chinese)
- Loongson Bar on Baidu Tieba (百度贴吧龙芯吧) (Chinese-only)
- Telegram Loongson group (Chinese & English; predominately Chinese but average English proficiency is high)
- QQ group 922566903 (Chinese-only)
There may be other QQ groups discussing Loongson topics, but your author does not use QQ so he does not know their numbers. (PMs and PRs are welcome for suggesting more of them!)
Note that these public venues are better suited to general chatter, instead of serious technical discussion. Because some of the end-user fans take rather radical technical/political positions, technical discussions may easily get derailed if any of the more sensitive topics is inadvertently touched. However, you should be able to find the appropriate place for communication between developers if you possess the desired communication attitude and ability; due to this, links to such places are left out for now. ;-)