Improve Your Old Storage with bcachefs
Marian Marinov <mm@yuhu.biz>
Who Am I?
I want to share a story...
The story of some hard drives...
➢ 2 years ago I upgraded my home storage
➢ from all HDD to all SSD drives
You all can speculate as to why I did that
Let's take a look at how HDDs work
Platters
The SLOWEST task of a rotational drive
Optimization
➢ File systems tried to optimize for this inefficiency
➢ write contents of the files within the same cylinder
➢ utilizing the space under the heads on all platters
➢ continue the writes on to the next sector (try not to skip any, reducing fragmentation)
➢ continue directly on to the next track
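To make the cylinder/head/sector idea above concrete, here is a minimal sketch of the classic CHS to LBA mapping. The geometry (16 heads, 63 sectors per track) and the function name `chs_to_lba` are assumptions picked for illustration; modern drives hide their real geometry behind LBA, so treat this as an idealized model only.

```shell
#!/bin/sh
# Hypothetical geometry, chosen only for illustration.
HEADS=16     # heads per cylinder (one per platter surface)
SECTORS=63   # sectors per track

# Classic CHS -> LBA mapping; sectors are numbered from 1.
chs_to_lba() {
    c=$1; h=$2; s=$3
    echo $(( (c * HEADS + h) * SECTORS + (s - 1) ))
}

chs_to_lba 0 0 1    # first sector of the disk
chs_to_lba 0 0 63   # last sector of the first track
chs_to_lba 0 1 1    # next head, same cylinder: no head movement
chs_to_lba 1 0 1    # next cylinder: the heads have to move
```

Consecutive block numbers stay inside one cylinder until all heads are used up, which is why filling a cylinder before moving on avoids seeks.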
Optimization
➢ File systems tried to optimize for this inefficiency
➢ The Linux kernel tried to optimize the reads and writes
➢ read-ahead
➢ combine and order the writes
➢ Storage controllers
➢ write-behind with BBU
Now comes bcache
Oh NO, this is not bcachefs
➢ It's 2011 and StorPool is just starting
➢ A guy named Kent Overstreet
➢ SSDs are booming
➢ Storage vendors are starting to offer things like
➢ CacheCade
➢ CacheVault
➢ etc.
Bcache is born
➢ Bcache is a simple block-level cache
➢ In 2013 it was included in kernel 3.10
➢ It has some bugs :)
Bcache
➢ Bcache is the groundwork for bcachefs
➢ It only knows about the blocks, but not their use
➢ It orders the writes so the backing rotational disk will do more sequential writes
Finally bcachefs
bcachefs was born in 2014-2015
➢ Current features
➢ Copy on write (COW) - like ZFS or Btrfs
➢ Full data checksumming
➢ Caching
➢ Compression (LZ4, ZSTD)
➢ Replication (RAID1/10)
The juicy details
              AVG  MIN  MAX
sw raid1       86   80   93  MB/s
bcachefs (1)   93   90   97  MB/s
bcachefs (2)  123  118  128  MB/s
               36   38   25
1. bcachefs with only rotational HDDs
2. bcachefs with SSD backed device
dd if=/dev/zero of=test$i oflag=direct count=200 bs=5M
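The AVG/MIN/MAX columns above can be produced from a list of per-run throughput values. The following is only a sketch of that bookkeeping; the helper name `summarize` and the sample values are made up and are not the measurements from the talk.

```shell
#!/bin/sh
# Print average, minimum and maximum of the numbers read from stdin,
# one throughput value (MB/s) per line.
summarize() {
    awk 'NR == 1 { min = $1; max = $1 }
         { sum += $1
           if ($1 < min) min = $1
           if ($1 > max) max = $1 }
         END { if (NR) printf "avg=%.0f min=%s max=%s\n", sum / NR, min, max }'
}

# Made-up sample values, one per dd run:
printf '100\n110\n120\n' | summarize
```

In practice the input would be the MB/s figure that dd reports at the end of each run of the loop.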
Fun facts
root@firefly:~# mount /dev/sdc1 /bcache/
root@firefly:~# df -h /bcache
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1 639G 1.3M 629G 1% /bcache
root@firefly:~# bcachefs device add --group=hdd1 /bcache /dev/sdd1
root@firefly:~# df -h /bcache
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1:/dev/dev-1 1.3T 1.3M 1.3T 1% /bcache
root@firefly:~# bcachefs device add --group=hdd1 /bcache /dev/sde1
root@firefly:~# bcachefs device add --group=hdd1 /bcache /dev/sdf1
root@firefly:~# df -h /bcache
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1:/dev/dev-1:/dev/dev-2:/dev/dev-3 2.6T 1.5M 2.5T 1% /bcache
Installation
➢ clone their kernel and build it
https://evilpiepirate.org/git/bcachefs.git
➢ clone bcachefs-tools and build them
https://evilpiepirate.org/git/bcachefs-tools.git
First steps
➢ Prepare your drives (as partitions or whole)
➢ format them together, so they become one filesystem
# bcachefs format /dev/sdc1 /dev/sdd1 /dev/sde /dev/sdf
➢ mount
# mount -t bcachefs /dev/sdc1:/dev/sdd1:/dev/sde:/dev/sdf /bcache
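The colon-separated device list is easy to get wrong by hand, so here is a small sketch that builds it from separate arguments. `join_devices` is a hypothetical helper name; the mount command is only printed, not executed.

```shell
#!/bin/sh
# Join all arguments with ':' to build the device list that
# "mount -t bcachefs" takes for a multi-device filesystem.
join_devices() {
    ( IFS=:; echo "$*" )
}

devs=$(join_devices /dev/sdc1 /dev/sdd1 /dev/sde /dev/sdf)
echo "mount -t bcachefs $devs /bcache"   # printed, not executed
```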
Make it faster
➢ Let's assume sdb is our SSD drive
# bcachefs format /dev/sdb
# bcachefs device add --group=hdd2 /bcache /dev/sdb
➢ We still haven't done anything...
# cd /sys/fs/bcachefs/UUID/options
# echo hdd2 > promote_target
# echo hdd2 > foreground_target
# echo hdd2 > metadata_target
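The three echo commands can be wrapped in one small function. This is a sketch under the assumption that the options directory behaves like the one on the slide; `set_targets` is a hypothetical name, and UUID stays a placeholder for the real filesystem UUID.

```shell
#!/bin/sh
# set_targets OPTIONS_DIR GROUP
# Write GROUP into the three target option files under OPTIONS_DIR,
# which is expected to be /sys/fs/bcachefs/UUID/options.
set_targets() {
    dir=$1; group=$2
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    for opt in promote_target foreground_target metadata_target; do
        echo "$group" > "$dir/$opt" || return 1
        printf '%s = %s\n' "$opt" "$(cat "$dir/$opt")"
    done
}

# Usage (as root):
# set_targets /sys/fs/bcachefs/UUID/options hdd2
```

Reading each value back after writing it gives a quick check that the option was accepted.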
Make it faster
➢ foreground_target
➢ writes are first cached here
➢ promote_target
➢ reads are cached on this device
Make it faster
➢ You have multiple options for compression
# cd /sys/fs/bcachefs/UUID/options
# cat background_compression
[none] lz4 gzip zstd
➢ Documentation states to avoid zstd for now
➢ also keep in mind block alignment
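The brackets in that output mark the value currently in use. A minimal sketch for pulling it out in a script, assuming the line format shown on the slide (`current_option` is a hypothetical helper name):

```shell
#!/bin/sh
# Print the bracketed (currently selected) value from an option line
# such as "[none] lz4 gzip zstd" read from stdin.
current_option() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

echo '[none] lz4 gzip zstd' | current_option

# Usage against the real file:
# current_option < /sys/fs/bcachefs/UUID/options/background_compression
```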
Make it redundant
➢ Make two copies of the data
# cd /sys/fs/bcachefs/UUID/options
# echo 2 > data_replicas
# echo 2 > metadata_replicas
You can do it all at once
# bcachefs format \
    --group=ssd /dev/sdb \
    --group=hdd /dev/sdc /dev/sdd /dev/sde /dev/sdf \
    --data_replicas=2 --metadata_replicas=2 \
    --foreground_target=ssd \
    --background_target=hdd \
    --promote_target=ssd
# mount -t bcachefs \
    /dev/sdb:/dev/sdc:/dev/sdd:/dev/sde:/dev/sdf /mnt
Some management stuff
➢ Remove a device
# bcachefs device evacuate /dev/sdf1
# bcachefs device remove /dev/sdf1
# bcachefs device offline /dev/sdf1
➢ Bring back a device
# bcachefs device online /dev/sdf1
# bcachefs device add /bcache /dev/sdf1
➢ Make sure all of your data is replicated
# bcachefs data rereplicate /bcache/
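When removing a device, the order of the steps matters, so it can help to script them with a dry-run switch. The sketch below uses only the subcommands from the slide; `run`, `retire_device` and the `DRY_RUN` variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Evacuate first; remove only if the evacuation succeeded.
retire_device() {
    dev=$1
    run bcachefs device evacuate "$dev" &&
    run bcachefs device remove "$dev"
}

DRY_RUN=1
retire_device /dev/sdf1
```

With DRY_RUN=1 the commands are only printed, so the sequence can be reviewed before it touches a real disk.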
QUESTIONS?
Marian Marinov <mm@yuhu.biz>
Thank You
Marian Marinov <mm@yuhu.biz>
