Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
`static const` variables can be computed and initialized at run-time
during initialization or the first time a function is called. Change
them to `static constexpr` to ensure they are computed at
compile-time.
This allows some removal of `strlen` because the length of the
`StringView` can be used which is pre-computed at compile-time.
Instead of setting an error in the execution context, we can directly
return that error or the successful value. This lets all callers, who
were already TRY-capable, simply TRY the expression evaluation.
The result of a SQL statement execution is either:
1. An error.
2. The list of rows inserted, deleted, selected, etc.
(2) is currently represented by a combination of the Result class and
the ResultSet list it holds. This worked okay, but issues start to
arise when trying to use Result in non-statement contexts (for example,
when introducing Result to SQL expression execution).
What we really need is for Result to be a thin wrapper that represents
both (1) and (2), and to not have any explicit members like a ResultSet.
So this commit removes ResultSet from Result, and introduces ResultOr,
which is just an alias for AK::ErrorOrr. Statement execution now returns
ResultOr<ResultSet> instead of Result. This further opens the door for
expression execution to return ResultOr<Value> in the future.
Lastly, this moves some other context held by Result over to ResultSet.
This includes the row count (which is really just the size of ResultSet)
and the command for which the result is for.
This statement (for now) outputs the name and types of the different
attributes in a table. It's not standard SQL but all DBMSs that I know
of implement a sort of statement for such functionality.
Since the output of DESCRIBE TABLE is just a relation, an internal
schema, `master` was created and a table definition for DESCRIBE into
it. The table definition and the master schema are not accessible by the
user.
The implementation of LIKE uses regexes under the hood, and this
implementation of REGEXP takes the same approach. It employs
PosixExtended from LibRegex with case insensitive and Unicode flags
set. The implementation of LIKE is based on SQLlite specs, but SQLlite
does not offer directions for a built-in regex functionality, so this
one uses LibRegex.
Ordering is done by replacing the straight Vector holding the query
result in the SQLResult object with a dedicated Vector subclass that
inserts result rows according to their sort key using a binary search.
This is done in the ResultSet class.
There are limitations:
- "SELECT ... ORDER BY 1" (or 2 or 3 etc) is supposed to sort by the
n-th result column. This doesn't work yet
- "SELECT ... column-expression alias ... ORDER BY alias" is supposed to
sort by the column with the given alias. This doesn't work yet
What does work however is something like
```SELECT foo FROM bar SORT BY quux```
i.e. sorted by a column not in the result set. Once functions are
supported it should be possible to sort by random functions.
The evaluation order of method parameters is unspecified in C++, and
so we couldn't rely on parse_statement() being called before
parse_escape() when building a MatchExpression.
With this patch, we explicitly parse what we need in the right order,
before building the MatchExpression object.
This option is already enabled when building Lagom, so let's enable it
for the main build too. We will no longer be surprised by Lagom Clang
CI builds failing while everything compiles locally.
Furthermore, the stronger `-Wsuggest-override` warning is enabled in
this commit, which enforces the use of the `override` keyword in all
classes, not just those which already have some methods marked as
`override`. This works with both GCC and Clang.
Fixes a crash that was caused by a syntax error which is difficult to
catch by the parser: usually identifiers are accepted in column lists,
but they are not in a list of column values to be inserted in an INSERT.
Fixed this by putting in a heuristic check; we probably need a better
way to do this.
Included tests for this case.
Also introduced a new SQL Error code, `NotYetImplemented`, and return
that instead of crashing when encountering unimplemented SQL.
The handling of filesystem level errors was basically non-existing or
consisting of `VERIFY_NOT_REACHED` assertions. Addressed this by
* Adding `open` methods to `Heap` and `Database` which return errors.
* Changing the interface of methods of these classes and clients
downstream to propagate these errors.
The constructors of `Heap` and `Database` don't open the underlying
filesystem file anymore.
The SQL statement handlers return an `SQLErrorCode::InternalError`
error code if an error comes back from the lower levels. Note that some
of these errors are things like duplicate index entry errors that should
be caught before the SQL layer attempts to actually update the database.
Added tests to catch attempts to open weird or non-existent files as
databases.
Finally, in between me writing this patch and submitting the PR the
AK::Result<Foo, Bar> template got deprecated in favour of ErrorOr<Foo>.
This resulted in more busywork.
This patch introduces table joins. It uses a pretty dumb algorithm-
starting with a singleton '__unity__' row consisting of a single boolean
value, a cartesian product of all tables in the 'FROM' clause is built.
This cartesian product is then filtered through the 'WHERE' clause,
again without any smarts just using brute force.
This patch required a bunch of busy work to allow for example the
ColumnNameExpression having to deal with multiple tables potentially
having columns with the same name.
Because SQL is the craptastic language that it is, sometimes expressions
need to know details about the calling statement. For example the tables
in the 'FROM' clause may be needed to determine which columns are
referenced in 'WHERE' expressions. So the current statement is added
to the ExecutionContext and a new 'execute' overload on Statement is
created which takes the Database and the Statement and builds an
ExecutionContaxt from those.
There was a lot of `VERIFY_NOT_REACHED` error handling going on. Fixed
most of those.
A bit of a caveat is that after every `evaluate` call for expressions
that are part of a statement the error status of the `SQLResult` return
value must be called.
Filters matching rows by doing a table scan and evaluating the `WHERE`
expression for every row.
Does not use indexes, for one because they do not exist yet.
Mostly just calls the appropriate methods on the Value objects.
Exception are the `Concatenate` (string concat), and the logical `and`
and `or` operators which are implemented directly in
`BinaryOperatorExpression::evaluate`
Up to now the only ``SELECT`` statement that worked was ``SELECT *
FROM <table>``. This commit allows a column list consisting of
column names and expressions in addition to ``*``. ``WHERE``
still doesn't work though.
Data types are now checked against the table data types. When multiple
rows are inserted at once, we check all rows to be matching W.R.T data
types. Only then we insert the rows.
This adds the ability to parse SQL INSERT statements in the following
form:
INSERT INTO schema.tablename VALUES (column1, column2, ...),
(column1, column2, ...), ...
Our existing implementation did not check the element type of the other
pointer in the constructors and move assignment operators. This meant
that some operations that would require explicit casting on raw pointers
were done implicitly, such as:
- downcasting a base class to a derived class (e.g. `Kernel::Inode` =>
`Kernel::ProcFSDirectoryInode` in Kernel/ProcFS.cpp),
- casting to an unrelated type (e.g. `Promise<bool>` => `Promise<Empty>`
in LibIMAP/Client.cpp)
This, of course, allows gross violations of the type system, and makes
the need to type-check less obvious before downcasting. Luckily, while
adding the `static_ptr_cast`s, only two truly incorrect usages were
found; in the other instances, our casts just needed to be made
explicit.
This patch provides very basic, bare bones implementations of the
INSERT and SELECT statements. They are *very* limited:
- The only variant of the INSERT statement that currently works is
SELECT INTO schema.table (column1, column2, ....) VALUES
(value11, value21, ...), (value12, value22, ...), ...
where the values are literals.
- The SELECT statement is even more limited, and is only provided to
allow verification of the INSERT statement. The only form implemented
is: SELECT * FROM schema.table
These statements required a bit of change in the Statement::execute
API. Originally execute only received a Database object as parameter.
This is not enough; we now pass an ExecutionContext object which
contains the Database, the current result set, and the last Tuple read
from the database. This object will undoubtedly evolve over time.
This API change dragged SQLServer::SQLStatement into the patch.
Another API addition is Expression::evaluate. This method is,
unsurprisingly, used to evaluate expressions, like the values in the
INSERT statement.
Finally, a new test file is added: TestSqlStatementExecution, which
tests the currently implemented statements. As the number and flavour of
implemented statements grows, this test file will probably have to be
restructured.
This patch introduces the ability execute parsed SQL statements. The
abstract AST Statement node now has a virtual 'execute' method. This
method takes a Database object as parameter and returns a SQLResult
object.
Also introduced here is the CREATE SCHEMA statement. Tables live in a
schema, and if no schema is present in a table reference the 'default'
schema is implied. This schema is created if it doesn't yet exist when
a Database object is created.
Finally, as a proof of concept, the CREATE SCHEMA and CREATE TABLE
statements received an 'execute' implementation. The CREATE TABLE
method is not able to create tables created from SQL queries yet.
The Order enum is used in the Meta component of LibSQL. Using this enum
meant having to include the monster AST/AST.h include file. Furthermore,
they are sort of basic and therefore can live in the general SQL
namespace. Moved to LibSQL/Type.h.
Also introduced a new class, SQLResult, which is needed in future
patches.
SQL was standardized before there was consensus on sane language syntax
constructs had evolved. The language is mostly case-insensitive, with
unquoted text converted to upper case. Identifiers can include lower
case characters and other 'special' characters by enclosing the
identifier with double quotes. A double quote is escaped by doubling it.
Likewise, a single quote in a literal string is escaped by doubling it.
All this means that the strategy used in the lexer, where a token's
value is a StringView 'window' on the source string, does not work,
because the value needs to be massaged before being handed to the
parser. Therefore a token now has a String containing its value. Given
the limited lifetime of a token, this is acceptable overhead.
Not doing this means that for example quote removal and double quote
escaping would need to be done in the parser or in AST node
construction, which would spread lexing basically all over the place.
Which would be suboptimal.
There was some impact on the sql utility and SyntaxHighlighter component
which was addressed by storing the token's end position together with
the start position in order to properly highlight it.
Finally, reviewing the tests for parsing numeric literals revealed an
inconsistency in which tokens we accept or reject: `1a` is accepted but
`1e` is rejected. Related to this is the fate of `0x`. Added a FIXME
reminding us to address this.