From: Siju George <sgeorge.ml <at> gmail.com>
Subject: Real World DragonFlyBSD Hammer DeDup figures - Reclaiming more than 1/4th ( 30% ) Disk Space from an Almost Full Drive
Newsgroups: gmane.os.bsd.india
Date: Tuesday 19th July 2011 11:12:31 UTC
Hi,


One of our DragonFlyBSD backup servers holds around 10 years of company
archives.
This is the result of running the HAMMER de-dup feature on it.

Short summary before dedup of the first hard disk

Filesystem                Size   Used  Avail Capacity  Mounted on
Backup1                   454G   451G   2.8G    99%    /Backup1

Short summary after dedup of the first hard disk

Filesystem                Size   Used  Avail Capacity  Mounted on
Backup1                   454G   313G   141G    69%    /Backup1

Reclaimed 138 GB, i.e. 30% of the disk space, without deleting anything or
considerably affecting the performance of the server.

Full Story:

The first backup server ran Debian Sarge, then Debian Etch, and then
OpenBSD with RAIDframe mirrors, because it was the only Unix/Linux that
would even detect the 120 GB hard disks we had back then.
Later I turned to DragonFlyBSD because of HAMMER ( no fsck, no RAID parity
checks and easy FS snapshots ).
So this DragonFly backup server holds backups going back around 10 years:

1) Web files of Projects ( html, php, images etc )

2) SQL dumps, both zipped and unzipped. HAMMER snapshots gave me the
luxury to do

http://www.dragonflybsd.org/docs/real_time_backup_server_for_microsoft_windows__44___linux__44___bsd_and_mac_os_x_clients/

But now we have SQL dumps of individual databases taken every hour and
made available to the developers using snapshots in the same manner
:-)

3) MS Word and Excel document files - company documents and user backups

4) PSD files and the like from designers, which take up a lot of space.

5) Git and SVN repository backups

6) Virtual machine images ( mostly qcow2 )

7) Configuration files of several servers and other details, backed up
daily, hourly or sometimes every 15 minutes, and maintained with
coarse-grained snapshots without pruning.

8) Several software packages and CD ISO images

9) Video/audio files such as mp3, avi, flv, mpg and so on.


The OS version currently is

DragonFly v2.11.0.247.gda17d9-DEVELOPMENT

 Processor is

AMD Athlon(tm) 64 Processor 3400+ (2193.63-MHz 686-class CPU)

Memory is

real memory  = 2113336320 (2015 MB)
avail memory = 2029342720 (1935 MB)

The server has four 500GB SATA disks mirroring PFSes from each other and
also from another DragonFly backup server on a different floor, using
'mirror-stream' started at boot from cron with an entry similar to

@reboot /sbin/hammer mirror-stream /Backup1/Data /Backup2/Data &


I have never reinstalled the OS but kept following the development
version from July 2009 so that is two years of rolling release which
is a great advantage in itself :-)

The first Disk is mounted as /Backup1 and seems to be a good Candidate
for dedup because it is almost full.

======================================================================================
Filesystem                Size   Used  Avail Capacity  Mounted on
Backup1                   454G   451G   2.8G    99%    /Backup1
/Backup1/pfs/@@-1:00001   454G   451G   2.8G    99%    /Backup1/Data
/Backup1/pfs/@@-1:00009   454G   451G   2.8G    99%    /Backup1/pkgsrc
/Backup1/pfs/@@-1:00002   454G   451G   2.8G    99%    /Backup1/VersionControl
/Backup1/pfs/@@-1:00003   454G   451G   2.8G    99%    /Backup1/test
/Backup1/pfs/@@-1:00005   454G   451G   2.8G    99%    /Backup1/www-5mbak/www-hot
/Backup1/pfs/@@-1:00006   454G   451G   2.8G    99%    /Backup1/mysql-1hbak/mysql-hot
/Backup1/pfs/@@-1:00007   454G   451G   2.8G    99%    /Backup1/project-docs-bak/project-docs
=======================================================================================

Full Details below.

=========================================================

       Label               Backup1
       No. Volumes         1
       FSID                e182...............................................
       HAMMER Version      4
Big block information
       Total           58140
       Used            57713 (99.27%)
       Reserved           69 (0.12%)
       Free              358 (0.62%)
Space information
       No. Inodes   11350364
       Total size       454G (487713669120 bytes)
       Used             451G (99.27%)
       Reserved         552M (0.12%)
       Free             2.8G (0.62%)
PFS information
       PFS ID  Mode    Snaps  Mounted on
            0  MASTER      0  /Backup1
            1  MASTER      0  /Backup1/Data
            2  MASTER      0  /Backup1/VersionControl
            3  MASTER      0  /Backup1/test
            5  MASTER      0  /Backup1/www-5mbak/www-hot
            6  MASTER      0  /Backup1/mysql-1hbak/mysql-hot
            7  MASTER      0  /Backup1/project-docs-bak/project-docs
            9  MASTER      0  /Backup1/pkgsrc
==========================================================


De Duping Steps Taken:
----------------------------------


1) Version Upgrading from 4 to 6.

=================================
dfly-bkpsrv# hammer version-upgrade /Backup1 5
hammer version-upgrade: succeeded
dfly-bkpsrv# hammer version-upgrade /Backup1 6
hammer version-upgrade: succeeded
=================================

2) Simulating using 'dedup-simulate' to get an idea.

=====================================================================================

dfly-bkpsrv# hammer dedup-simulate /Backup1
Dedup-simulate /Backup1: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 0
Dedup-simulate /Backup1 succeeded
Simulated dedup ratio = 1.07

dfly-bkpsrv# hammer dedup-simulate /Backup1/Data
Dedup-simulate /Backup1/Data: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 1
Dedup-simulate /Backup1/Data succeeded
Simulated dedup ratio = 1.34

dfly-bkpsrv# hammer dedup-simulate /Backup1/pkgsrc
Dedup-simulate /Backup1/pkgsrc: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 9
Dedup-simulate /Backup1/pkgsrc succeeded
Simulated dedup ratio = 1.10

dfly-bkpsrv# hammer dedup-simulate /Backup1/VersionControl
Dedup-simulate /Backup1/VersionControl: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 2
Dedup-simulate /Backup1/VersionControl succeeded
Simulated dedup ratio = 2.79

dfly-bkpsrv# hammer dedup-simulate /Backup1/test
Dedup-simulate /Backup1/test: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 3
Dedup-simulate /Backup1/test succeeded
Simulated dedup ratio = 0.00

dfly-bkpsrv# hammer dedup-simulate /Backup1/www-5mbak/www-hot
Dedup-simulate /Backup1/www-5mbak/www-hot: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 5
Dedup-simulate /Backup1/www-5mbak/www-hot succeeded
Simulated dedup ratio = 1.39

dfly-bkpsrv# hammer dedup-simulate /Backup1/mysql-1hbak/mysql-hot
Dedup-simulate /Backup1/mysql-1hbak/mysql-hot: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 6
Dedup-simulate /Backup1/mysql-1hbak/mysql-hot succeeded
Simulated dedup ratio = 13.78

dfly-bkpsrv# hammer dedup-simulate /Backup1/project-docs-bak/project-docs
Dedup-simulate /Backup1/project-docs-bak/project-docs: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 7
Dedup-simulate /Backup1/project-docs-bak/project-docs succeeded
Simulated dedup ratio = 1.15

===================================================================================================
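
The simulated ratios can be turned into a rough estimate of how much of each PFS's data a real dedup run might free. The Python sketch below assumes the ratio means "referenced data divided by allocated data" (the real run's figures in the next step look consistent with that reading, but it is an assumption here), so the fraction saved is 1 - 1/ratio. The function name `expected_saving` is made up for illustration.

```python
# Simulated ratios copied from the dedup-simulate output above.
simulated_ratios = {
    "/Backup1": 1.07,
    "/Backup1/Data": 1.34,
    "/Backup1/pkgsrc": 1.10,
    "/Backup1/VersionControl": 2.79,
    "/Backup1/test": 0.00,
    "/Backup1/www-5mbak/www-hot": 1.39,
    "/Backup1/mysql-1hbak/mysql-hot": 13.78,
    "/Backup1/project-docs-bak/project-docs": 1.15,
}


def expected_saving(ratio):
    """Fraction of a PFS's data that dedup could free.

    Assumes ratio = referenced / allocated, so saved = 1 - 1/ratio.
    A ratio of 1.0 or less (including the 0.00 of an empty PFS) saves nothing.
    """
    if ratio <= 1.0:
        return 0.0
    return 1.0 - 1.0 / ratio


# List the PFSes from best to worst dedup candidate.
ranked = sorted(simulated_ratios.items(), key=lambda item: item[1], reverse=True)
for pfs, ratio in ranked:
    print(f"{pfs:42s} ratio {ratio:5.2f}  saves ~{expected_saving(ratio):.0%}")
```

On these numbers the hourly MySQL dumps are by far the best candidate, which fits data that is mostly identical from one dump to the next.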

3) Real 'de-dup' of the Mother File System and all PFSes

=======================================================================

dfly-bkpsrv# hammer dedup /Backup1
Dedup /Backup1: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 0
Dedup /Backup1 succeeded
Dedup ratio = 1.07
     625 MB referenced
     585 MB allocated
     224 KB skipped
          0 CRC collisions
          0 SHA collisions
          0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/Data
Dedup /Backup1/Data: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 1
Dedup /Backup1/Data succeeded
Dedup ratio = 1.34
     259 GB referenced
     193 GB allocated
      40 MB skipped
       1944 CRC collisions
          0 SHA collisions
         20 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/pkgsrc
Dedup /Backup1/pkgsrc: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 9
Dedup /Backup1/pkgsrc succeeded
Dedup ratio = 1.10
    1687 MB referenced
    1539 MB allocated
    1718 KB skipped
          3 CRC collisions
          0 SHA collisions
          0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/VersionControl
Dedup /Backup1/VersionControl: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 2
Dedup /Backup1/VersionControl succeeded
Dedup ratio = 2.75
     160 MB referenced
      58 MB allocated
     853 KB skipped
          0 CRC collisions
          0 SHA collisions
          0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/test
Dedup /Backup1/test: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 3
Dedup /Backup1/test succeeded
Dedup ratio = 0.00
        0 B referenced
        0 B allocated
        0 B skipped
          0 CRC collisions
          0 SHA collisions
          0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/www-5mbak/www-hot
Dedup /Backup1/www-5mbak/www-hot: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 5
Dedup /Backup1/www-5mbak/www-hot succeeded
Dedup ratio = 1.39
      50 GB referenced
      36 GB allocated
      53 MB skipped
        167 CRC collisions
          0 SHA collisions
          0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/mysql-1hbak/mysql-hot
Dedup /Backup1/mysql-1hbak/mysql-hot: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 6
Dedup /Backup1/mysql-1hbak/mysql-hot succeeded
Dedup ratio = 13.78
      117 GB referenced
    8747 MB allocated
        0 B skipped
          0 CRC collisions
          0 SHA collisions
          0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/project-docs-bak/project-docs
Dedup /Backup1/project-docs-bak/project-docs: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 7
Dedup /Backup1/project-docs-bak/project-docs succeeded
Dedup ratio = 1.15
     247 MB referenced
     215 MB allocated
     102 KB skipped
          0 CRC collisions
          0 SHA collisions
          0 bigblock underflows
=================================================================================================
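
As a sanity check on the output above, the reported "Dedup ratio" can be recomputed as referenced divided by allocated. The Python sketch below does that for each PFS; it assumes hammer reports binary units (1 GB = 1024 MB), and because the GB figures are already rounded, the recomputed values only match to within a percent or so.

```python
MB_PER_GB = 1024  # assumption: hammer prints binary units

# (referenced in MB, allocated in MB, ratio as reported), from the output above.
# The empty /Backup1/test PFS is left out because nothing was allocated.
results = {
    "/Backup1": (625, 585, 1.07),
    "/Backup1/Data": (259 * MB_PER_GB, 193 * MB_PER_GB, 1.34),
    "/Backup1/pkgsrc": (1687, 1539, 1.10),
    "/Backup1/VersionControl": (160, 58, 2.75),
    "/Backup1/www-5mbak/www-hot": (50 * MB_PER_GB, 36 * MB_PER_GB, 1.39),
    "/Backup1/mysql-1hbak/mysql-hot": (117 * MB_PER_GB, 8747, 13.78),
    "/Backup1/project-docs-bak/project-docs": (247, 215, 1.15),
}


def dedup_ratio(referenced_mb, allocated_mb):
    """Referenced data divided by allocated data; 0.0 when nothing is allocated."""
    if allocated_mb == 0:
        return 0.0
    return referenced_mb / allocated_mb


for pfs, (referenced, allocated, reported) in results.items():
    computed = dedup_ratio(referenced, allocated)
    print(f"{pfs:42s} reported {reported:5.2f}  recomputed {computed:5.2f}")
```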

Full info of de-duped volume

=======================================================================
Volume identification
	Label               Backup1
	No. Volumes         1
	FSID                e1859f6a-6ab8-11de-9bc4-011617202aa6
	HAMMER Version      6
Big block information
	Total           58140
	Used            40032 (68.85%)
	Reserved           69 (0.12%)
	Free            18039 (31.03%)
Space information
	No. Inodes   11371863
	Total size       454G (487713669120 bytes)
	Used             313G (68.85%)
	Reserved         552M (0.12%)
	Free             141G (31.03%)
=====================================================================
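
The big-block counters give an independent way to check the reclaimed space. The Python sketch below derives the big-block size from the totals printed above (an inference from these numbers, not a value taken from the HAMMER documentation) and compares the used-block count before dedup (57713, from the earlier listing) with the count after (40032).

```python
# Figures copied from the two volume listings in this post.
total_bytes = 487713669120
total_bigblocks = 58140
used_before = 57713
used_after = 40032

# Size of one big block, inferred from the totals.
bigblock_bytes = total_bytes // total_bigblocks  # 8388608 bytes, i.e. 8 MiB

freed_blocks = used_before - used_after          # 17681 big blocks
freed_gib = freed_blocks * bigblock_bytes / 2**30
freed_fraction = freed_blocks / total_bigblocks

print(f"big block size : {bigblock_bytes} bytes")
print(f"freed          : {freed_blocks} big blocks = {freed_gib:.1f} GiB")
print(f"share of volume: {freed_fraction:.1%}")
```

This comes to about 138 GiB, or roughly 30% of the volume, in line with the 'df -h' figures.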

Now, after de-duping all PFSes on the first disk, 'df -h' gives these details

Filesystem                Size   Used  Avail Capacity  Mounted on
Backup1                   454G   313G   141G    69%    /Backup1

Before de-duping it was

Filesystem                Size   Used  Avail Capacity  Mounted on
Backup1                   454G   451G   2.8G    99%    /Backup1

So that is reclaiming 30% of Disk space amounting to 138 GB :-)

Careful design and configuration of PFSes and snapshots can save a lot of
disk space.
But de-dup can save still more :-)


In order to 'de-dup' the file system automatically every day, using
'hammer cleanup' from the periodic script, I have put something like
this in the configuration of each PFS.

=============================================
dfly-bkpsrv# hammer config /Backup1/VersionControl/
snapshots 1d 1000d
prune     1d 15m
rebalance 1d 5m
reblock   1d 60m
recopy    30d 60m
dedup     1d 30m
==============================================
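
For readers who want to see what these lines amount to, here is a small Python sketch that parses the configuration shown above into seconds. It assumes each line has the form "task period limit", where the second time value is a retention time for snapshots and a maximum run time for the other tasks (that reading follows my understanding of hammer(8) and should be checked against the manual page). It only handles the m, h and d suffixes, and the helper names are made up for illustration.

```python
import re

# Configuration text copied from the 'hammer config' output above.
CONFIG = """\
snapshots 1d 1000d
prune     1d 15m
rebalance 1d 5m
reblock   1d 60m
recopy    30d 60m
dedup     1d 30m
"""

# Only the suffixes that appear in the sample (plus hours) are handled.
UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}


def to_seconds(token):
    """Convert a token such as '30m' or '1d' into seconds."""
    match = re.fullmatch(r"(\d+)([mhd])", token)
    if match is None:
        raise ValueError(f"unrecognised time token: {token!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def parse_config(text):
    """Return {task: (period_seconds, limit_seconds)} for each 3-field line."""
    schedule = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        task, period, limit = fields
        schedule[task] = (to_seconds(period), to_seconds(limit))
    return schedule


schedule = parse_config(CONFIG)
for task, (period, limit) in schedule.items():
    print(f"{task:10s} every {period // 3600:4d} h, limit {limit // 60:8d} min")
```

Under that reading, the 'dedup 1d 30m' line means dedup runs once a day for at most 30 minutes, so a large PFS may take several days of cleanup runs to be fully de-duped.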

A million thanks to Matt and team for DragonFly, HAMMER, de-dup,
vkernel and a lot of other goodies coming up :-D

Thanks and Regards

--Siju
 