Mattermost – Export channels with attachments

This week a co-worker asked me how to export channels together with their posts and attachments. My first thought was the bulk import/export tooling we provide with the CLI, but that can't export file attachments. So I checked the API and found that posts carry a list of file IDs (file_ids) when they have files attached. To download these files you could look up each ID manually and search the storage, or you could use the following script, which is hosted on GitHub: https://github.com/cjohannsen81/mattermost-attachment-exporter

In the first section of the script you will probably need to change a few things:

import requests

# Base URL of your Mattermost server (no trailing slash)
url = "http://3.80.208.135:8065"
auth_token = ""
login_url = url + "/api/v4/users/login"

# Credentials used to log in; replace them with your own
payload = {"login_id": "admin@mattermost.com",
           "password": "MattermostDemo,1"}
s = requests.Session()
r = s.post(login_url, json=payload)
r.raise_for_status()  # stop here if the login was rejected
auth_token = r.headers.get("Token")
hed = {'Authorization': 'Bearer ' + auth_token}

# Either prompt for the names or set them statically
#team_name = input("Please enter the team name (lowercase, no blanks): ")
team_name = "demoteam"
#channel_name = input("Please enter the channel name to check (lowercase, no blanks): ")
channel_name = "testexport"

# Filled later with the IDs of all attachments found in the channel
files = []

The url should be set to your Mattermost server, and the login_id and password in the payload to your own credentials. Next, decide whether you want to prompt for the team and channel names or just set them statically. The auth_token and files variables are only initialized here and are filled automatically later. In short, this section handles authentication and sets a few basic variables.
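If you would rather not keep the server address and password in the script itself, the settings could be read from the environment instead. The following is only a minimal sketch: the variable names (MM_URL, MM_LOGIN_ID and so on) and the two helper functions are hypothetical and not part of the original script. It also shows one way to fail with a readable message when the login response carries no Token header, instead of running into a TypeError when building the header.

```python
import os


def load_settings(env=os.environ):
    """Read connection settings from environment variables.

    The variable names are hypothetical; pick whatever fits your setup.
    """
    return {
        "url": env.get("MM_URL", "http://localhost:8065").rstrip("/"),
        "login_id": env.get("MM_LOGIN_ID", ""),
        "password": env.get("MM_PASSWORD", ""),
        "team_name": env.get("MM_TEAM", "demoteam"),
        "channel_name": env.get("MM_CHANNEL", "testexport"),
    }


def auth_header(token):
    """Build the Authorization header, failing loudly if no token came back."""
    if not token:
        raise RuntimeError("Login failed: no Token header in the response")
    return {"Authorization": "Bearer " + token}
```

With this in place, hed could be built as auth_header(r.headers.get("Token")) after the login request.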

The second part will do the work and gather the data.

def get_team_id():
    # Search for the team by name and pass its ID on
    team_url = url + "/api/v4/teams/search"
    payload = {"term": team_name}
    response = requests.post(team_url, headers=hed, json=payload)
    info = response.json()
    print("Found team " + info[0]["name"])
    team_id = info[0]["id"]
    get_channel_id(team_id)

def get_channel_id(team_id):
    # Search for the channel inside the team and pass its ID on
    channel_url = url + "/api/v4/teams/" + team_id + "/channels/search"
    payload = {"term": channel_name}
    response = requests.post(channel_url, headers=hed, json=payload)
    info = response.json()
    print("Found channel " + info[0]["name"])
    channel_id = info[0]["id"]
    get_posts(channel_id)

def get_posts(channel_id):
    posts_url = url + "/api/v4/channels/" + channel_id + "/posts"
    response = requests.get(posts_url, headers=hed)
    info = response.json()
    # Each post may carry a list of attachment IDs under "file_ids"
    for post in info['posts'].values():
        for file_id in post.get("file_ids") or []:
            files.append(file_id)

def get_uploads(files):
    for file_id in files:
        # Look up the original filename first
        info_url = url + "/api/v4/files/" + file_id + "/info"
        response = requests.get(info_url, headers=hed)
        info = response.json()
        filename = info["name"]

        # Then download the file content and write it to disk
        file_url = url + "/api/v4/files/" + file_id
        response = requests.get(file_url, headers=hed)
        with open(filename, 'wb') as f:
            f.write(response.content)

get_team_id()
get_uploads(files)

As you can see, the functions simply call each other, which could be done in a much nicer and more efficient way (had no time :)). Because the files have to be identified by their file ID, I need to look up the team, search for the channel in the team, fetch the posts in the channel, and then iterate over the posts to find the attachments.
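One way the chained calls could be untangled is to let each step return its result instead of calling the next function. The sketch below covers only the post-processing step. Both function names and the fetch_page callable are hypothetical, and it assumes the posts endpoint accepts page and per_page query parameters, which is worth checking against the API reference for your server version. Splitting the parsing from the HTTP call also makes it possible to try the logic without a running server.

```python
def extract_file_ids(posts_response):
    """Collect attachment IDs from one parsed posts response body."""
    posts = posts_response.get("posts") or {}
    # Use the "order" list when present so the result follows the post order
    order = posts_response.get("order") or list(posts)
    file_ids = []
    for post_id in order:
        # Posts without attachments may have no "file_ids" key (or a null)
        file_ids.extend(posts.get(post_id, {}).get("file_ids") or [])
    return file_ids


def collect_all_file_ids(fetch_page, per_page=60):
    """Walk through result pages until an empty page is returned.

    fetch_page(page, per_page) is a hypothetical callable that returns the
    parsed JSON body for one page of posts.
    """
    file_ids = []
    page = 0
    while True:
        body = fetch_page(page, per_page)
        if not body.get("posts"):
            break
        file_ids.extend(extract_file_ids(body))
        page += 1
    return file_ids
```

In the script, fetch_page could wrap the existing request, for example a small function that calls requests.get on the channel's posts URL with headers=hed and params={"page": page, "per_page": per_page} and returns response.json().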

The script can easily be called using

python get_channel_posts_and_files.py

and should start downloading all the attachments with their original filenames.
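Since every file is written under its original name into the current directory, two attachments with the same name would overwrite each other. A small helper along these lines could avoid that. It is a sketch only, and unique_filename is a hypothetical name that does not exist in the original script. It also strips any directory part from the name as a precaution.

```python
import os


def unique_filename(name, taken):
    """Return a filename not yet in `taken`, adding a numeric suffix if needed."""
    # Drop any directory components so files always land in the target folder
    name = os.path.basename(name) or "attachment"
    stem, ext = os.path.splitext(name)
    candidate = name
    counter = 1
    while candidate in taken:
        candidate = "{}_{}{}".format(stem, counter, ext)
        counter += 1
    taken.add(candidate)
    return candidate
```

In get_uploads, the filename taken from the file info could be passed through this helper together with a set that is created once before the loop.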

Hope this helps!