Thread: Replace duplicate attachments through hardlinks?

  1. #1
    Member
    Join Date
    Jul 2008
    Location
    Saarbruecken, Germany
    Posts
    79

    Replace duplicate attachments through hardlinks?

    Hi there,

    Since single-instance storage only works for new attachments that arrive in Zarafa via LMTP > dagent, I am looking for a way to "single-instance" old attachments.

    With standard files on a file system, I'd just do a duplicate check and replace duplicate files with a hardlink to the existing one (using for example "fdupes"). Can I do that with Zarafa attachments, too?
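
For illustration, the duplicate-check-and-link idea could be sketched in Python roughly as follows. This is only a minimal sketch under assumptions, not how fdupes or hardlink work internally: the function names (`file_digest`, `find_duplicates`, `link_duplicates`) are made up for this example, files count as duplicates purely by size and SHA-256 content, and ownership, permissions and timestamps are ignored. It defaults to a dry run.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group regular files under root by (size, digest); keep groups of 2+."""
    groups = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and anything that is not a regular file
            groups[(os.path.getsize(path), file_digest(path))].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


def link_duplicates(root, dry_run=True):
    """Replace duplicates with hardlinks to the first file of each group.

    Returns the number of bytes that were (or, in a dry run, would be) saved.
    """
    saved = 0
    for paths in find_duplicates(root):
        keep = paths[0]
        for dup in paths[1:]:
            if os.path.samefile(keep, dup):
                continue  # already the same inode, nothing to gain
            saved += os.path.getsize(dup)
            if not dry_run:
                # Link under a temporary name, then rename over the duplicate,
                # so the duplicate's path never disappears in between.
                tmp = dup + ".hardlink-tmp"
                os.link(keep, tmp)
                os.replace(tmp, dup)
    return saved
```

Calling `link_duplicates("/some/test/copy")` only reports the potential saving; `dry_run=False` actually links. Trying this on a copy of the data first seems wise.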

    Thanks,
    Marco
    Zarafa Gold Partner and Linux Solutions - http://www.inett.de/

  2. #2
    Senior Member
    Join Date
    Jan 2009
    Location
    Hanover, Germany
    Posts
    839
    Sounds like an interesting approach. I did a quick run of hardlink over my /var/lib/zarafa/, as I already had it installed. A quick check of the "would-be-linked" files showed exactly the same content in both files.

    Code:
    hardlink -tnv /var/lib/zarafa -x index
    ...
    Mode:     dry-run
    Files:    6710
    Linked:   2327 files
    Compared: 46873 files
    Saved:    135.48 MiB
    Duration: 2.24 seconds
    Last edited by fbartels; 03-05-2012 at 06:46 AM. Reason: Exclude index folder
    Regards Felix

    Zarafa ALPHA/BETA/RC feedback in BETA forum please.
    Zarafa IRC chat: irc.freenode.com > #zarafa
    Zarafa documentation: http://www.zarafa.com/content/documentation

  3. #3
    Member
    Join Date
    Jul 2008
    Location
    Saarbruecken, Germany
    Posts
    79
    The question is: Will an attachment ever change without replacing it by a new one?

    For example:

    - Mail1 has Attachment1
    - Mail2 has Attachment2 (same as Attachment1)

    Replacing Attachment2 with a hardlink to Attachment1 is easy. Now both are the same physical file.

    Scenario 1: The attachment is a Word file that gets edited from within Outlook and saved back afterwards. What happens then?

    - Will the old attachment be modified in place? Then the linked one changes too. This can lead to data corruption, as the original is gone.
    - Will the old attachment be deleted and the new one saved as a new attachment? Then it is OK and two versions exist (the original, unchanged but with only one remaining hardlink, and the new, changed one).

    So, any hints on what will happen? Will it always behave the same way, or could it differ depending on how the attachment is changed?
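
The difference between the two cases can be tried out on any Linux filesystem without touching Zarafa. Below is a small sketch with made-up file names; it says nothing about which of the two Zarafa itself does, it only shows how hardlinks react to an in-place overwrite versus a write-new-then-rename.

```python
import os
import tempfile


def demo_hardlink_updates():
    """Show how two ways of 'changing' a file treat a hardlinked copy."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "attachment1")
        b = os.path.join(d, "attachment2")
        with open(a, "wb") as f:
            f.write(b"original")
        os.link(a, b)  # b is now a second name for the same inode

        # Case 1: in-place overwrite. Both names see the new content.
        with open(a, "wb") as f:  # truncates and rewrites the shared inode
            f.write(b"edited in place")
        with open(b, "rb") as f:
            after_in_place = f.read()

        # Case 2: write a new file, then rename it over one of the names.
        new = os.path.join(d, "attachment1.new")
        with open(new, "wb") as f:
            f.write(b"replaced")
        os.replace(new, a)  # a now points at a fresh inode, b is untouched
        with open(b, "rb") as f:
            after_replace = f.read()
        still_linked = os.path.samefile(a, b)

    return after_in_place, after_replace, still_linked
```

In case 1 the second name sees the edit as well (the risky behaviour); in case 2 the second name keeps its content and the two names are no longer linked.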

    Best regards,
    Marco

  4. #4
    Member
    Join Date
    Jul 2008
    Location
    Saarbruecken, Germany
    Posts
    79
    Any new opinions on this?

    Best regards,
    Marco

  5. #5
    I tried sending an email to multiple users and then ran hardlink on those attachments.
    After that I changed the attachment in the original email, and the other users still had the old attachment in their mail without errors.
    If one of the other users alters their attachment, that is no problem either.

    It looks like a new file is created for the altered email, and the old file plus its hardlinks are kept for the other mails.

    Since the attachments are stored in subfolders (/0/0/, /0/1 etc.), I am going to convert one of the subfolders first and see if there are any problems before doing the rest.
    According to the dry run, I can save a total of 50GB of space.
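
One way to check what a converted subfolder really gained is to compare the apparent size with the size when every inode is counted only once. A rough sketch, assuming nothing Zarafa-specific; `disk_usage` is a hypothetical helper name and it uses file sizes rather than allocated blocks, so the numbers are approximate.

```python
import os
import stat


def disk_usage(root):
    """Return (apparent_bytes, unique_bytes) for regular files under root.

    unique_bytes counts each inode once, so hardlinked copies are not
    counted twice. The difference is roughly what hardlinking saved.
    """
    seen = set()
    apparent = 0
    unique = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, etc.
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return apparent, unique
```

Running it on one subfolder before and after the conversion should show `unique_bytes` dropping while `apparent_bytes` stays the same.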

    Best regards,

    Theo
    Currently using ZCP 7.1.8
    Outlook 2010, Windows XP and Firefox 26
    Active directory windows server 2003
    SuSe sles 11.2
    Z-Push 2.1 being used with Android and Iphone

  6. #6
    Senior Member
    Join Date
    Sep 2009
    Location
    Munster/Germany
    Posts
    157
    That sounds nice!
    Reducing the size of the attachment store would be a nice feature.
    current system : ZARAFA 6.40.17 on Ubuntu 8.04 LTS (x64)
    user base : Windows 2003 R2 (x32) with ADS-plugin ...
    backups are overvalued ...
    restore is important ...

  7. #7
    I ran hardlink on the whole attachment folder, saved 50GB and no problems at all so far.

    If you don't have hardlink, download and compile it from jak-linux.org/projects/hardlink

    Theo
    Last edited by Helpdeskzandvoort; 26-04-2013 at 02:36 PM.

  8. #8
    Senior Member
    Join Date
    Sep 2009
    Location
    Munster/Germany
    Posts
    157
    current folder size:
    root@srvzarafa:/var/lib/zarafa# du -hs
    19G

    root@srvzarafa:/home/support/# hardlink -tn /var/lib/zarafa -x index
    Mode: dry-run
    Files: 87823
    Linked: 39040 files
    Compared: 93680 files
    Saved: 6,28 GiB
    Duration: 8690,73 seconds

    That sounds nice: roughly a third of the space saved (6.28 GiB out of 19G)!

    The long duration is because I ran it during office hours;
    it should be faster in the evening.

    I think I'll give it a try.

  9. #9
    Junior Member
    Join Date
    Jul 2012
    Posts
    1
    Hi guys,

    elmuchacho: What were the results of your operation with hardlink? Is it working as expected?

    I'm interested in this subject, but I wonder whether it can be used in a production environment. I'm using RHEL 6.3 with Zarafa 7.1.4 and installed the hardlink package from the rhel-6-server-rpms repo, but I noticed it is quite a different program: it was written by Jakub Jelinek, whereas the one you tested was written by Julian Andres Klode, and it gives quite different output when run in simulation mode:

    [root@zarafa_main ~]# hardlink -cnv /var/lib/zarafa/attachments
    Directories 212
    Objects 4574485
    IFREG 4574273
    Mmaps 3546517
    Comparisons 3547604
    Would link 3545686
    Would save 136044175360

    I wonder if anyone has tested the RHEL version of hardlink and can confirm that there are no issues with it.
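
To compare this with the MiB/GiB figures earlier in the thread: the "Would save" value above looks like a plain byte count (an assumption based on its magnitude, not something confirmed here). Under that assumption, a small helper (hypothetical name `format_bytes`) converts it to binary units.

```python
def format_bytes(n):
    """Format a byte count using binary (1024-based) units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(n)
    for unit in units:
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit we know about.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


print(format_bytes(136044175360))  # prints "126.70 GiB"
```

So, if the value is in bytes, the dry run would save about 126.7 GiB.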

    Regards,
    Marcin

  10. #10
    Junior Member
    Join Date
    Sep 2013
    Posts
    13
    It seems the developers are unaware that Zarafa does not really have Single Instance Attachment storage, though it is only one step away from that goal...
    Too bad.
