ente/architecture
Manav Rathi e9d76688ce Move to monorepo
Move all of our code into a monorepo in preparation of open sourcing our server.

First I describe the general plan, then later I've kept an exact log of the
commands that I used. This was all done prior to this commit, but this commit
(that introduces the various top level files) seems like a good way to summarize
the entire process.

Clone auth. Auth is our base repository.

```sh
git clone https://github.com/ente-io/auth.git && cd auth
```

Move all of auth's files into `auth/`.

```sh
mkdir auth
git mv `find . -maxdepth 1 | grep -v -e '\.$' -e '\.\/.git$' -e '\.\/auth$'` auth
git commit -m 'Move into auth/'
```

Add photos-web as a new remote, and fetch its main.

```sh
git remote add photos-web https://github.com/ente-io/photos-web.git
git fetch photos-web main
```

Switch to main of photos-web.

```sh
git checkout -b photos-web-main photos-web/main
```

Move all of its files into `web` (note, the find now has an extra exclusion for
`web`, but we keep all the old ones too):

```sh
mkdir web
git mv `find . -maxdepth 1 | grep -v -e '^\.$' -e '^\.\/.git$' -e '^\.\/auth$' -e '^\.\/web$'` web
git commit -m 'Move into web/'
```

Switch back to our main branch, and merge the photos-web branch. The
`--allow-unrelated-histories` flag is needed (since these two branches don't
have any previous common ancestor).

```sh
git checkout main
git merge --allow-unrelated-histories photos-web-main
```

That's it. We then repeat this process for all the other repositories that we
need to bring in.

There is no magic involved here, so regular git commands will continue working.
However, all the files get renamed, so to track the git history prior to this
rename commit we'll need to pass the `--follow` flag.

    git log --follow -p -- auth/migration-guides/encrypted_export.md

For some file names like README.md which exist in multiple repositories, this
doesn't seem to work as well (I don't fully understand why). For example,
`git log --follow -p -- auth/README.md` lists the changes to all the READMEs,
not just the auth README.md.

```sh

git clone https://github.com/ente-io/auth.git ente
cd ente

mkdir auth
git mv `find . -maxdepth 1 | grep -v -e '\.$' -e '\.\/.git$' -e '\.\/auth$'` auth
git commit -m 'Move into auth/'

git remote add photos-web https://github.com/ente-io/photos-web.git
git fetch photos-web main
git checkout -b photos-web-main photos-web/main

mkdir web
git mv `find . -maxdepth 1 | grep -v -e '^\.$' -e '^\.\/.git$' -e '^\.\/auth$' -e '^\.\/web$'` web
git commit -m 'Move into web/'

git checkout main
git merge --allow-unrelated-histories photos-web-main
git branch -D photos-web-main
git remote remove photos-web

git remote add photos-app https://github.com/ente-io/photos-app.git
git fetch photos-app main
git checkout -b photos-app-main photos-app/main

mkdir mobile
git mv `find . -maxdepth 1 | grep -v -e '^\.$' -e '^\.\/.git$' -e '^\.\/auth$' -e '^\.\/web$' -e '^\.\/mobile$'` mobile
git commit -m 'Move into mobile/'

git checkout main
git merge --allow-unrelated-histories photos-app-main
git branch -D photos-app-main
git remote remove photos-app

git remote add photos-desktop https://github.com/ente-io/photos-desktop.git
git fetch photos-desktop main
git checkout -b photos-desktop-main photos-desktop/main

mkdir desktop
git mv `find . -maxdepth 1 | grep -v -e '^\.$' -e '^\.\/.git$' -e '^\./.gitmodules$' -e '^\.\/desktop$'` desktop
git mv .gitmodules desktop
git commit -m 'Move into desktop/'

git checkout main
git merge --allow-unrelated-histories photos-desktop-main
git branch -D photos-desktop-main
git remote remove photos-desktop

git remote add cli https://github.com/ente-io/cli.git
git fetch cli main
git checkout -b cli-main cli/main

mkdir cli
git mv `find . -maxdepth 1 | grep -v -e '^\.$' -e '^\.\/.git$' -e '^\.\/cli$'` cli
git commit -m 'Move into cli/'

git checkout main
git merge --allow-unrelated-histories cli-main
git branch -D cli-main
git remote remove cli

git remote add docs https://github.com/ente-io/docs.git
git fetch docs main
git checkout -b docs-main docs/main

mkdir docs-1
git mv `find . -maxdepth 1 | grep -v -e '^\.$' -e '^\.\/.git$' -e '^\.\/docs-1$'` docs-1
git mv docs-1 docs
git commit -m 'Move into docs/'

git checkout main
git merge --allow-unrelated-histories docs-main
git branch -D docs-main
git remote remove docs
```
2024-03-01 13:01:41 +05:30

Architecture

Your data is end-to-end encrypted with Ente.

Meaning, it is encrypted with your keys before it leaves your device.

Our source code has been audited to verify that these keys are available only to you.

Meaning, only you can access your data.

What follows is an explanation of how we do what we do.

Key Encryption

Fundamentals

Master Key

When you sign up for Ente, your client generates a masterKey for you. This never leaves your device unencrypted.

Key Encryption Key

Once you choose a password, a keyEncryptionKey is derived from it. This never leaves your device.

Flows

Primary Device

During registration, your masterKey is encrypted with your keyEncryptionKey, and the resultant encryptedMasterKey is then sent to our servers for storage.

Secondary Device

When you sign in on a secondary device, after you successfully verify your email, our servers give you back your encryptedMasterKey that was sent to us by your primary device.

You are then prompted to enter your password. Once entered, your keyEncryptionKey is derived, and the client decrypts your encryptedMasterKey with this, to yield your original masterKey.

If the decryption fails, the client will know that the derived keyEncryptionKey was wrong, indicating an incorrect password, and this information will be surfaced to you.
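The registration and sign-in flows above can be sketched as follows. This is a minimal illustration and not Ente's implementation: it uses only Python's standard library, so PBKDF2 stands in for `crypto_pwhash` and a toy SHAKE-256/HMAC construction stands in for `crypto_secretbox_easy`. The helper names (`derive_kek`, `seal`, `unseal`, `unlock`) are hypothetical, and storing the salt next to the encryptedMasterKey is an assumption.

```python
import hashlib
import hmac
import secrets


def derive_kek(password: str, salt: bytes) -> bytes:
    # Stand-in for crypto_pwhash (Argon2); PBKDF2 is used only because it
    # ships with Python.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000, dklen=32)


def seal(key: bytes, plaintext: bytes) -> bytes:
    # Toy authenticated encryption standing in for crypto_secretbox_easy.
    # Illustration only; do not use this construction for real data.
    nonce = secrets.token_bytes(24)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def unseal(key: bytes, box: bytes) -> bytes:
    nonce, tag, body = box[:24], box[24:56], box[56:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        # A wrong key (i.e. a wrong password) is detected here.
        raise ValueError("decryption failed")
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


# Primary device, at registration.
master_key = secrets.token_bytes(32)
salt = secrets.token_bytes(16)
encrypted_master_key = seal(derive_kek("correct horse", salt), master_key)
# Only encrypted_master_key (and, by assumption, the salt) is sent to the server.


# Secondary device, after email verification.
def unlock(password: str) -> bytes:
    return unseal(derive_kek(password, salt), encrypted_master_key)
```

Calling `unlock` with the right password returns the original masterKey; any other password makes `unseal` raise, which is the signal the client surfaces as "incorrect password".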

Privacy

  • Since only you know your password, only you can derive your keyEncryptionKey.
  • Since only you can derive your keyEncryptionKey, only you have access to your masterKey.

Keep reading to learn about how this masterKey is used to encrypt your data.


Data Encryption

Fundamentals

Collection Key

Each of your items in Ente belongs to what we call a collection. A collection can be either a folder (like "Camera" or "Screenshots") or an album (like "Awkward Reunion"). In case of Auth, we create a default root collection.

Each collection has a collectionKey. These never leave your device unencrypted.

File Key

Every piece of your data has a fileKey. These never leave your device unencrypted.

Flows

Upload

  • Each file and associated metadata is encrypted with randomly generated fileKeys.
  • Each fileKey is encrypted with the collectionKey of the collection (folder/album) the file belongs to. If such a collection does not exist, one is created with a randomly generated collectionKey. All collection metadata (like name, folder-path, etc.) is encrypted with this collectionKey.
  • Each collectionKey is then encrypted with your masterKey.
  • All of the above mentioned encrypted data is then pushed to the server for storage.

Download

  • All of the above mentioned encrypted data is pulled from the server.
  • You first decrypt each file's collectionKey with your masterKey.
  • You then decrypt each file's fileKey with its respective collectionKey.
  • Finally, you decrypt each file and associated metadata with the respective fileKeys.
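The upload and download steps above form a chain of wrapped keys, which can be sketched as follows. This is a minimal illustration under stated assumptions: `seal`/`unseal` are a toy stand-in for `crypto_secretbox_easy` built from Python's standard library, the record field names are hypothetical, and the file is encrypted in one piece rather than in chunks.

```python
import hashlib
import hmac
import secrets


def seal(key: bytes, plaintext: bytes) -> bytes:
    # Toy authenticated encryption (stand-in for crypto_secretbox_easy).
    # Illustration only; do not use this construction for real data.
    nonce = secrets.token_bytes(24)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def unseal(key: bytes, box: bytes) -> bytes:
    nonce, tag, body = box[:24], box[24:56], box[56:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("decryption failed")
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


master_key = secrets.token_bytes(32)


def upload(file_data: bytes, collection_key: bytes) -> dict:
    # Every file gets its own random fileKey.
    file_key = secrets.token_bytes(32)
    return {
        "encryptedFile": seal(file_key, file_data),
        "encryptedFileKey": seal(collection_key, file_key),
        "encryptedCollectionKey": seal(master_key, collection_key),
    }


def download(record: dict) -> bytes:
    # Unwrap in the reverse order: masterKey -> collectionKey -> fileKey -> file.
    collection_key = unseal(master_key, record["encryptedCollectionKey"])
    file_key = unseal(collection_key, record["encryptedFileKey"])
    return unseal(file_key, record["encryptedFile"])


collection_key = secrets.token_bytes(32)
record = upload(b"contents of a photo", collection_key)
```

Everything in `record` is ciphertext, so the server can store it without being able to read the file; `download(record)` recovers the original bytes only for someone holding the masterKey.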

Privacy

  • As explained in the previous section, only you have access to your masterKey.
  • Since only you have access to your masterKey, only you can decrypt the collectionKeys.
  • Since only you have access to the collectionKeys, only you can decrypt the fileKeys.
  • Since only you have access to the fileKeys, only you can decrypt the files and their associated metadata.

Sharing

Fundamentals

Public Key

When you sign up for Ente, your app generates a publicKey for you. This is public, and is stored at our servers in plain text.

Verification ID

A Verification ID is a human-readable representation of a publicKey that is accessible within the clients for verifying the identity of a receiver.

Private Key

Along with the publicKey, your app also generates a corresponding privateKey for you. This never leaves your device unencrypted.

The privateKey is encrypted with your masterKey that only you have access to. This encryptedPrivateKey is stored at our servers.

Flow

Sharing is similar to the previous section, except that the collectionKey of a collection is shared with a receiver after encrypting it with the receiver's publicKey. To elaborate,

Sender

  • Each file and associated metadata was already encrypted with randomly generated fileKeys.
  • Each of these fileKeys was also encrypted with the collectionKey of the collection (folder/album) that is now being shared.
  • The collectionKey is now encrypted with the publicKey of the receiver.
  • All of the above mentioned encrypted data is then pushed to the server for storage.

Receiver

  • All of the above mentioned encrypted data is pulled from the server.
  • The receiver first decrypts the collectionKey with their privateKey.
  • They then decrypt each file's fileKey with this collectionKey.
  • Finally, they decrypt each file and associated metadata with the respective fileKeys.

Privacy

  • Since only the receiver has access to their masterKey, only they can decrypt their encryptedPrivateKey to access their privateKey.
  • Since only the receiver has access to their privateKey, only they can decrypt the collectionKey that was sent to them.
  • Since only the receiver has access to the collectionKey, only they can decrypt the fileKeys of files belonging to that album/folder.
  • Since only the receiver has access to the fileKeys of files belonging to that album/folder, only they can decrypt the files and associated metadata.

A sender can view the Verification ID of the receiver within the app's sharing screen, and compare this with the Verification ID displayed on the receiver's device. If the two identifiers match across devices, the sender knows that the publicKey they are encrypting to really belongs to the receiver.


Key Recovery

Fundamentals

Recovery Key

When you sign up for Ente, your app generates a recoveryKey for you. This never leaves your device unencrypted.

Flow

Storage

Your recoveryKey and masterKey are encrypted with each other and stored on the server.

Access

This encrypted recoveryKey is downloaded when you sign in on a new device. It is decrypted with your masterKey and surfaced to you whenever you request it.

Recovery

Post email verification, if you're unable to unlock your account because you have forgotten your password, the client will prompt you to enter your recoveryKey.

The client then pulls the masterKey that was earlier encrypted and pushed to the server (as discussed in Key Encryption), and decrypts it with the entered recoveryKey. If the decryption succeeds, the client will know that you have entered the correct recoveryKey.

Now that you have your masterKey, the client will prompt you to set a new password, using which it will derive a new keyEncryptionKey. This is then used to encrypt your masterKey and this new encryptedMasterKey is uploaded to our servers, similar to what was earlier discussed in Key Encryption.
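The storage and recovery steps above can be sketched as follows. This is a minimal illustration, not Ente's implementation: PBKDF2 stands in for `crypto_pwhash`, `seal`/`unseal` are a toy stand-in for `crypto_secretbox_easy`, and the `server` dictionary with its field names is hypothetical.

```python
import hashlib
import hmac
import secrets


def derive_kek(password: str, salt: bytes) -> bytes:
    # Stand-in for crypto_pwhash (Argon2).
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000, dklen=32)


def seal(key: bytes, plaintext: bytes) -> bytes:
    # Toy authenticated encryption (stand-in for crypto_secretbox_easy).
    # Illustration only; do not use this construction for real data.
    nonce = secrets.token_bytes(24)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def unseal(key: bytes, box: bytes) -> bytes:
    nonce, tag, body = box[:24], box[24:56], box[56:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("decryption failed")
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


# Storage: the two keys are encrypted with each other at sign-up.
master_key = secrets.token_bytes(32)
recovery_key = secrets.token_bytes(32)
server = {
    "masterKeyEncryptedWithRecoveryKey": seal(recovery_key, master_key),
    "recoveryKeyEncryptedWithMasterKey": seal(master_key, recovery_key),
}


def recover(entered_recovery_key: bytes, new_password: str) -> bytes:
    # Raises ValueError if the entered recoveryKey is wrong.
    recovered = unseal(entered_recovery_key, server["masterKeyEncryptedWithRecoveryKey"])
    # Re-wrap the masterKey under a keyEncryptionKey derived from the new password.
    salt = secrets.token_bytes(16)
    server["kekSalt"] = salt
    server["encryptedMasterKey"] = seal(derive_kek(new_password, salt), recovered)
    return recovered
```

Note that the masterKey itself never changes during recovery; only the wrapping under the new keyEncryptionKey is replaced, so previously encrypted collectionKeys stay valid.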

Privacy

  • Since only you have access to your masterKey, only you can access your recoveryKey.
  • Since only you can access your recoveryKey, only you can reset your password.

Authentication

Fundamentals

One Time Token

When you attempt to verify ownership of an email address, our server generates a oneTimeToken that, if presented, confirms your access to that email address. This token is valid for a short time and can only be used once.

Authentication Token

When you successfully authenticate yourself against our server by proving ownership of your email (and, in the future, any other configured vectors), the server generates an authToken that can from then on be used to authenticate against our private APIs.

Encrypted Authentication Token

A generated authToken is returned to your client after being encrypted with your publicKey. This encryptedAuthToken can only be decrypted with your privateKey.

Flow

  • You are asked for an email address, to which a oneTimeToken is sent.
  • Once you present this information correctly to our server, an authToken is generated and an encryptedAuthToken is returned to you, along with your other encrypted keys.
  • You are then prompted to enter your password, using which your keyEncryptionKey is derived and your masterKey is decrypted (as discussed in Key Encryption).
  • Using this masterKey, the rest of your keys, including your privateKey, are decrypted (as discussed in Sharing).
  • Using your privateKey, the client will then decrypt the encryptedAuthToken that was earlier encrypted by our server with your publicKey.
  • This decrypted authToken can then from there on be used to authenticate all API calls against our servers.

Security

Only by verifying access to your email and knowing your password can you obtain an authToken that can be used to authenticate yourself against our servers.


Implementation Details

We rely on the high level APIs exposed by this wonderful library called libsodium.

Key Generation

crypto_secretbox_keygen is used to generate all random keys within the application. Your masterKey, recoveryKey, collectionKey, fileKey are all 256-bit keys generated using this API.

Key Pair Generation

crypto_box_keypair is used to generate your publicKey and privateKey pairs.

Key Derivation

crypto_pwhash is used to derive your keyEncryptionKey from your password.

crypto_pwhash_OPSLIMIT_SENSITIVE and crypto_pwhash_MEMLIMIT_SENSITIVE are used as the limits for computation and memory respectively. If the operation fails due to insufficient memory, the former is doubled and the latter is halved progressively, until a key can be derived. If during this process the memory limit is reduced to a value less than crypto_pwhash_MEMLIMIT_MIN, the client will not let you register from that device.
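The fallback described above can be sketched as a small loop. This is an illustration only: the constant values are what libsodium's Argon2id headers define to the best of my knowledge and should be treated as assumptions, `derive` is a hypothetical callable standing in for `crypto_pwhash`, and signalling an out-of-memory failure with `MemoryError` is a choice made for this sketch.

```python
from typing import Callable, Tuple

# Assumed values of the libsodium constants (Argon2id).
OPSLIMIT_SENSITIVE = 4
MEMLIMIT_SENSITIVE = 1024 * 1024 * 1024  # 1 GiB
MEMLIMIT_MIN = 8192


def derive_with_fallback(
    derive: Callable[[int, int], bytes],
) -> Tuple[bytes, int, int]:
    """Try the sensitive limits first; on memory failure trade memory for time."""
    ops, mem = OPSLIMIT_SENSITIVE, MEMLIMIT_SENSITIVE
    while mem >= MEMLIMIT_MIN:
        try:
            # The limits that worked must be remembered so that the same
            # key can be derived again at the next sign-in.
            return derive(ops, mem), ops, mem
        except MemoryError:
            ops *= 2   # double the computation limit
            mem //= 2  # halve the memory limit
    raise RuntimeError("cannot derive a key on this device")


def fake_derive(ops: int, mem: int) -> bytes:
    # Simulates a device that cannot allocate more than 256 MiB.
    if mem > 256 * 1024 * 1024:
        raise MemoryError
    return bytes(32)


key, ops, mem = derive_with_fallback(fake_derive)
```

Doubling one limit while halving the other keeps the product of the two constant, which is presumably the intent: the derivation stays roughly as expensive for an attacker even on low-memory devices.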

Internally, this uses Argon2 v1.3, the winner of the 2015 Password Hashing Competition.

Symmetric Encryption

crypto_secretbox_easy is used to encrypt your masterKey, recoveryKey, privateKey, collectionKeys and fileKeys. Internally, this uses XSalsa20 stream cipher with Poly1305 MAC for authentication.

crypto_secretstream_* APIs are used to encrypt your file data in chunks. Internally, this uses XChaCha20 stream cipher with Poly1305 MAC for authentication.

Asymmetric Encryption

crypto_box_seal is used in sharing to encrypt a collectionKey with the receiver's publicKey. It is also used to encrypt an authToken that is issued to a user, with their publicKey.

Internally, this uses X25519 for key exchange, XSalsa20 stream cipher for encryption and Poly1305 MAC for authentication.

Salt & Nonce Generation

randombytes_buf is used to generate a new salt/nonce every time data needs to be hashed/encrypted.

Verification ID Generation

Verification ID is generated by converting the SHA-256 hash of a publicKey to its corresponding BIP39 mnemonic phrase.
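The conversion can be sketched as follows, assuming the standard BIP39 encoding (entropy plus a SHA-256 checksum, split into 11-bit word indices). The function names are hypothetical, and the 2048-word BIP39 list is not reproduced here, so it is passed in as a parameter. Whether Ente's clients follow this encoding exactly is an assumption.

```python
import hashlib
from typing import List


def entropy_to_indices(entropy: bytes) -> List[int]:
    # BIP39: append the first (entropy bits / 32) bits of SHA-256(entropy)
    # as a checksum, then split the result into 11-bit word indices.
    ent_bits = len(entropy) * 8
    cs_bits = ent_bits // 32
    checksum = hashlib.sha256(entropy).digest()
    bits = int.from_bytes(entropy, "big") << cs_bits
    bits |= int.from_bytes(checksum, "big") >> (256 - cs_bits)
    total = ent_bits + cs_bits
    return [(bits >> (total - 11 * (i + 1))) & 0x7FF for i in range(total // 11)]


def verification_id(public_key: bytes, wordlist: List[str]) -> str:
    # The SHA-256 hash of the publicKey is used as 256 bits of entropy,
    # which yields a 24-word phrase.
    if len(wordlist) != 2048:
        raise ValueError("a BIP39 wordlist has exactly 2048 words")
    indices = entropy_to_indices(hashlib.sha256(public_key).digest())
    return " ".join(wordlist[i] for i in indices)
```

Because the phrase depends only on the publicKey, the sender and the receiver compute the same 24 words independently and can compare them out of band.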


Further Details

Thank you for reading this far! For implementation details, we request you to check out our code.

If you'd like to help us improve this document, kindly email security@ente.io.

We have a separate document that outlines how we replicate your data across 3 different cloud providers to ensure reliability.