* move crates into a 'crates' folder
* add cargo-about to check dependency licenses
* create ggml-sys bindings and build as a static library.
simple addition sanity test passes
* update licenses
* yeet alice
* yeet qa model
* yeet fact model
* [wip] idiomatic rust bindings for ggml
* [ggml] mul, add and sub ops implemented for tensors.
i think it would be easier to implement a bert model first in order to figure out which ops the binding should include. for instance, are view and concat needed?
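
as a rough illustration of what "idiomatic" could mean for the mul, add and sub ops, here is a minimal sketch that exposes them through the `std::ops` traits. the `Tensor` struct below is a hypothetical, CPU-only stand-in that owns its data; it is not the ggml-sys type, and the real binding would presumably wrap ggml tensor handles instead of a `Vec<f32>`.

```rust
use std::ops::{Add, Mul, Sub};

/// Hypothetical stand-in for a tensor; NOT the ggml-sys type.
#[derive(Debug, Clone, PartialEq)]
pub struct Tensor {
    shape: Vec<usize>,
    data: Vec<f32>,
}

impl Tensor {
    /// Builds a tensor, panicking if `shape` and `data` disagree on size.
    pub fn new(shape: Vec<usize>, data: Vec<f32>) -> Self {
        let expected: usize = shape.iter().product();
        assert_eq!(expected, data.len(), "shape does not match data length");
        Self { shape, data }
    }

    pub fn data(&self) -> &[f32] {
        &self.data
    }

    /// Applies `f` element-wise; both tensors must have the same shape
    /// (no broadcasting in this sketch).
    fn zip_with(&self, rhs: &Tensor, f: impl Fn(f32, f32) -> f32) -> Tensor {
        assert_eq!(self.shape, rhs.shape, "shape mismatch");
        let data = self
            .data
            .iter()
            .zip(&rhs.data)
            .map(|(a, b)| f(*a, *b))
            .collect();
        Tensor { shape: self.shape.clone(), data }
    }
}

// Implemented on references so operands are not consumed by `a + b`.
impl Add<&Tensor> for &Tensor {
    type Output = Tensor;
    fn add(self, rhs: &Tensor) -> Tensor {
        self.zip_with(rhs, |a, b| a + b)
    }
}

impl Sub<&Tensor> for &Tensor {
    type Output = Tensor;
    fn sub(self, rhs: &Tensor) -> Tensor {
        self.zip_with(rhs, |a, b| a - b)
    }
}

/// Element-wise (Hadamard) product, not a matrix multiplication.
impl Mul<&Tensor> for &Tensor {
    type Output = Tensor;
    fn mul(self, rhs: &Tensor) -> Tensor {
        self.zip_with(rhs, |a, b| a * b)
    }
}
```

with this shape of api, callers can write `&a + &b`, `&a - &b` and `&a * &b` directly. whether view and concat fit the same pattern is still open.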
the summarization becomes much better if we let the model first generate a candidate summary and then improve on it.
doing the improvement step just once seems to significantly improve the summary.
we also now use an llm (mistral 7b) for the summarization, as we can then use the same model for multiple tasks and serve it using gpus, thus significantly decreasing the latency.
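
the generate-then-improve flow described above can be sketched roughly as follows. everything here is an assumption made for illustration: the `TextModel` trait, the `summarize_with_refinement` function and the prompt wording are hypothetical and are not the code or prompts used in this repo. any backend (for example a served mistral 7b) would sit behind the trait.

```rust
/// Hypothetical abstraction over a text-completion backend.
pub trait TextModel {
    fn complete(&self, prompt: &str) -> String;
}

/// Lets plain closures act as models, which keeps the sketch easy to test.
impl<F: Fn(&str) -> String> TextModel for F {
    fn complete(&self, prompt: &str) -> String {
        self(prompt)
    }
}

/// Generates a draft summary, then asks the model to improve it `rounds` times.
/// `rounds = 1` corresponds to doing the improvement step just once.
pub fn summarize_with_refinement<M: TextModel>(model: &M, text: &str, rounds: usize) -> String {
    // Step 1: candidate summary. The prompt wording is illustrative only.
    let mut summary = model.complete(&format!(
        "Summarize the following text.\n\nText:\n{text}\n\nSummary:"
    ));

    // Step 2: feed the draft back in together with the original text.
    for _ in 0..rounds {
        summary = model.complete(&format!(
            "Improve the draft summary of the text below.\n\n\
             Text:\n{text}\n\nDraft summary:\n{summary}\n\nImproved summary:"
        ));
    }
    summary
}
```

each refinement round costs one extra model call, so a single round means two calls per summary.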