The challenges with large attachments in Confluence

Challenges

Large attachments present several challenges to Confluence admins:

  • Possible performance issues

  • Ever-growing disk usage and hosting costs

  • Longer time to perform backups and upgrades

Possible performance issues with very large attachments

The following activities may be resource intensive with very large attachments:


Tracking large attachment uploads

With Large Attachment Tracker, admins can:

  • Check if performance issues or outages are related to large attachment uploads

  • Identify large attachments for housekeeping

  • Identify users/teams who are using a lot of disk space

  • Identify usage patterns to justify an increase in disk storage or to plan an archiving strategy
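
As a rough illustration of the housekeeping idea above, the sketch below lists files at or above a size threshold in a directory tree, largest first. It is not how Large Attachment Tracker works internally; it is a generic file-system scan, and the function name `find_large_files` and the example path in the usage note are hypothetical.

```python
import os
from typing import List, Tuple


def find_large_files(root: str, threshold_bytes: int) -> List[Tuple[str, int]]:
    """Return (path, size_in_bytes) pairs for files at or above the threshold.

    Results are sorted largest first. This is a plain directory walk and
    makes no assumption about how Confluence lays out its attachment store.
    """
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files that vanish or cannot be read during the scan.
                continue
            if size >= threshold_bytes:
                results.append((path, size))
    results.sort(key=lambda item: item[1], reverse=True)
    return results
```

For example, `find_large_files("/path/to/attachments", 100 * 1024 * 1024)` would list files of 100 MB or more under a hypothetical attachments directory; substitute the actual location used by your instance.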


Useful info

If a file is larger than 100 MB, its content will not be indexed.

When a file is uploaded, Confluence attempts to extract and index its text.
This process is memory intensive and can cause out-of-memory errors when very large files are uploaded.
Confluence has a number of safeguards to prevent this from happening:

  • If the uploaded file is larger than 100 MB:

    • Confluence will not attempt to extract text or index the file contents.

    • Only the filename will be searchable.

  • If the uploaded file is one of the following types, Confluence will only extract up to:

    • 1 MB of text from Excel (.xlsx) and PowerPoint (.pptx)

    • 8 MB of text from PDF (.pdf)

    • 10 MB of text from other text files (including .txt, .xml, .html, .rtf, etc.)

    • 16 MB of text from Word (.docx)
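
The limits above can be expressed as a small lookup, which may help when estimating how much of a given attachment will be searchable. This is only a sketch of the rules as listed here, not Confluence's implementation. The function name `extraction_limit` is hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and only the extensions named in the list are covered.

```python
import os
from typing import Optional

MB = 1024 * 1024  # assumption: the documented limits use binary megabytes

# Files larger than this are not text-extracted or content-indexed at all.
INDEXING_SIZE_LIMIT = 100 * MB

# Maximum amount of extracted text per file type, as listed above.
EXTRACTION_LIMITS = {
    ".xlsx": 1 * MB,
    ".pptx": 1 * MB,
    ".pdf": 8 * MB,
    ".txt": 10 * MB,
    ".xml": 10 * MB,
    ".html": 10 * MB,
    ".rtf": 10 * MB,
    ".docx": 16 * MB,
}


def extraction_limit(filename: str, size_bytes: int) -> Optional[int]:
    """Return the maximum number of bytes of text that would be extracted.

    Returns 0 when the file exceeds 100 MB (only the filename is searchable)
    and None when the extension is not one of those listed above.
    """
    if size_bytes > INDEXING_SIZE_LIMIT:
        return 0
    extension = os.path.splitext(filename)[1].lower()
    return EXTRACTION_LIMITS.get(extension)
```

For instance, `extraction_limit("report.pdf", 5 * MB)` returns the 8 MB cap, while a 101 MB PDF returns 0 because its contents are not indexed.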


References