
Today I Learned

Export Channel Messages from Teams for Slack CSV Import

  1. Create an Entra app in App Registrations.
  2. Go to API permissions and grant the necessary Microsoft Graph permissions listed below. Some of them may be superfluous for your setup; trim as needed. :)

Permission Name                   Type
-------------------------------  ---------
Channel.ReadBasic.All            Application
ChannelMember.Read.All           Delegated
ChannelMessage.Read.All          Delegated
ChannelMessage.Read.All          Application
ChannelSettings.Read.All         Application
Chat.Read                        Delegated
Chat.ReadBasic                   Delegated
ChatMember.Read                  Delegated
ChatMessage.Read                 Delegated
Directory.Read.All               Application
Group.Read.All                   Delegated
Group.Read.All                   Application
Team.ReadBasic.All               Delegated
Team.ReadBasic.All               Application
TeamSettings.Read.All            Delegated
TeamSettings.Read.All            Application
User.Read                        Delegated
  3. Go to Certificates and Secrets and create a new secret.

  4. Go to OneDrive and download your channel's files. You'll find them in their corresponding folder.

  5. Upload the files to web-accessible storage (e.g. a web server); the script links attachments relative to ATTACHMENT_BASE_URL.

  6. Fill in the variables in the CONFIG section below.

  7. Create a virtual environment, activate it, and install the required packages:

E.g.

python3 -m venv .venv
source .venv/bin/activate
pip install requests python-dateutil beautifulsoup4

  8. Execute the Python file. You'll be prompted to select a team and then a channel, and the script will produce a JSON and a CSV file. Use the CSV file to import your data into Slack (https://{your_company_name}.slack.com/services/import).
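
For reference, the CSV follows the four-column layout that save_csv writes (timestamp, channel, username, text), with every field quoted. The row below is made up for illustration:

"timestamp","channel","username","text"
"1700000000","general","jane.doe@example.com","Hello from Teams"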

export.py

import requests
import json
import csv
import os
from dateutil import parser
from bs4 import BeautifulSoup
from urllib.parse import quote

# === CONFIG ===
CLIENT_ID = ""
CLIENT_SECRET = ""
TENANT_ID = ""
ATTACHMENT_BASE_URL = "https://example.com/attachments/"  # Replace with your actual base URL for attachments

def get_access_token():
    url = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token"
    data = {
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "scope": "https://graph.microsoft.com/.default",
        "grant_type": "client_credentials"
    }
    r = requests.post(url, data=data)
    r.raise_for_status()
    return r.json()["access_token"]

def graph_get(url, token):
    headers = { "Authorization": f"Bearer {token}" }
    r = requests.get(url, headers=headers)
    r.raise_for_status()
    return r.json()

def select_from_list(items, label_key):
    for i, item in enumerate(items):
        print(f"{i+1}. {item[label_key]}")
    choice = int(input("Select number: ")) - 1
    return items[choice]

def fetch_all_messages(access_token, team_id, channel_id):
    headers = { "Authorization": f"Bearer {access_token}" }
    url = f"https://graph.microsoft.com/v1.0/teams/{team_id}/channels/{channel_id}/messages"
    all_messages = []

    while url:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        data = response.json()
        all_messages.extend(data.get("value", []))
        url = data.get("@odata.nextLink")

    return all_messages

def build_user_email_map(user_ids, access_token):
    headers = {"Authorization": f"Bearer {access_token}"}
    email_map = {}

    for uid in user_ids:
        url = f"https://graph.microsoft.com/v1.0/users/{uid}"
        try:
            r = requests.get(url, headers=headers)
            if r.status_code == 404:
                continue  # silently ignore missing users
            r.raise_for_status()
            data = r.json()
            email = data.get("userPrincipalName")
            if email:
                email_map[uid] = email
        except Exception as e:
            print(f"Skipped user {uid}: {e}")
            continue

    return email_map

def clean_html(content):
    return BeautifulSoup(content or "", "html.parser").get_text()

def convert_to_slack_format(messages, email_map):
    slack_messages = []
    id_to_ts = {}

    for msg in messages:
        reply_to = msg.get("replyToId")
        is_reply = reply_to is not None

        from_field = msg.get("from") or {}
        user_info = from_field.get("user") or {}
        user_id = user_info.get("id")
        user = email_map.get(user_id) or user_info.get("displayName") or from_field.get("application", {}).get("displayName") or "unknown"

        text = clean_html(msg.get("body", {}).get("content", "")).strip()

        # Generate encoded attachment URLs
        attachment_links = []

        for att in msg.get("attachments", []):
            name = att.get("name")
            if name:
                encoded_name = quote(name)
                attachment_links.append(f"{ATTACHMENT_BASE_URL}{encoded_name}")

        for content in msg.get("hostedContents", []):
            content_type = content.get("contentType", "file")
            fallback_name = f"{msg['id']}_{content_type.replace('/', '_')}"
            encoded_fallback = quote(fallback_name)
            attachment_links.append(f"{ATTACHMENT_BASE_URL}{encoded_fallback}")

        full_text = "\n".join(filter(None, [text] + attachment_links))
        if not full_text.strip():
            continue

        timestamp = parser.parse(msg["createdDateTime"]).timestamp()
        ts_string = f"{timestamp:.6f}"

        slack_msg = {
            "type": "message",
            "user": user,
            "text": full_text,
            "ts": ts_string
        }

        if is_reply:
            parent_ts = id_to_ts.get(reply_to)
            slack_msg["thread_ts"] = parent_ts or ts_string

        slack_messages.append(slack_msg)

        if not is_reply:
            id_to_ts[msg["id"]] = ts_string

    return slack_messages

def save_slack_json(slack_messages, filename="general.json"):
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(slack_messages, f, indent=2)

def save_csv(slack_messages, channel_name, filename="slack_messages.csv"):
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, quoting=csv.QUOTE_ALL)
        writer.writerow(["timestamp", "channel", "username", "text"])

        for msg in sorted(slack_messages, key=lambda x: float(x["ts"])):
            ts = int(float(msg["ts"]))
            channel = channel_name
            username = msg["user"]
            text = msg["text"].replace("\r", "")
            writer.writerow([ts, channel, username, text])

def main():
    token = get_access_token()

    teams = graph_get("https://graph.microsoft.com/v1.0/groups?$filter=resourceProvisioningOptions/Any(x:x eq 'Team')", token)["value"]
    team = select_from_list(teams, "displayName")

    team_id = team["id"]
    channels = graph_get(f"https://graph.microsoft.com/v1.0/teams/{team_id}/channels", token)["value"]
    channel = select_from_list(channels, "displayName")

    messages = fetch_all_messages(token, team_id, channel["id"])

    user_ids = set()
    for msg in messages:
        from_field = msg.get("from")
        if isinstance(from_field, dict):
            user_info = from_field.get("user")
            if isinstance(user_info, dict):
                user_id = user_info.get("id")
                if user_id:
                    user_ids.add(user_id)
    email_map = build_user_email_map(user_ids, token)

    slack_messages = convert_to_slack_format(messages, email_map)

    json_file = f"{channel['displayName'].replace(' ', '_').lower()}.json"
    csv_file = f"{channel['displayName'].replace(' ', '_').lower()}.csv"

    save_slack_json(slack_messages, json_file)
    print(f"Saved to {json_file}")

    save_csv(slack_messages, channel["displayName"], csv_file)
    print(f"Saved to {csv_file}")

if __name__ == "__main__":
    main()

TIL How to Batch Compress PDF Files Using Ghostscript

Install Ghostscript (on macOS via Homebrew):

brew install ghostscript

Save the following script as compress_pdfs.sh and make it executable (chmod +x compress_pdfs.sh):

#!/bin/bash
[ $# -lt 3 ] && { echo "Usage: $0 /input_dir /output_dir /quality"; exit 1; }
input_dir="$1"; output_dir="$2"; quality="$3"
mkdir -p "$output_dir"
for file in "$input_dir"/*.pdf; do
  base=$(basename "$file" .pdf)
  gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS="$quality" -dNOPAUSE -dQUIET -dBATCH -sOutputFile="$output_dir/${base}_compressed.pdf" "$file"
done

Then run it:

./compress_pdfs.sh /path/to/input /path/to/output /quality_setting

Quality options: /screen, /ebook, /printer, /prepress.
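
To pick a quality level, it can help to compress one file at a given setting and compare sizes. A quick sketch (report.pdf is a placeholder filename):

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=report_ebook.pdf report.pdf
du -h report.pdf report_ebook.pdf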

Say goodbye to online PDF converters.

TIL: How to Quickly Check Which AWS Regions Support SES Using a Bash Script

Although it has improved lately for some services, AWS is still notoriously unfriendly when it comes to checking which services are enabled and running across different regions. Here's a script you can run in your terminal to quickly see which regions have SES (Simple Email Service) available:

#!/bin/bash
for region in $(aws ec2 describe-regions --query "Regions[].RegionName" --output text); do
    echo "Checking SES in region: $region"
    if output=$(aws ses get-send-quota --region "$region" 2>&1); then
        echo "SES is active in region: $region"
        echo "$output"
    else
        echo "SES is not available in region: $region"
    fi
    echo "---------------------------------------"
done

This script loops through all AWS regions and checks if SES is running in each one, giving you a quick and easy overview of SES availability.

The output:

Checking SES in region: us-east-1
SES is active in region: us-east-1
{
    "Max24HourSend": 50000.0,
    "MaxSendRate": 14.0,
    "SentLast24Hours": 0.0
}
---------------------------------------
Checking SES in region: us-west-1
SES is not available in region: us-west-1
---------------------------------------
Checking SES in region: eu-west-1
SES is active in region: eu-west-1
{
    "Max24HourSend": 50000.0,
    "MaxSendRate": 10.0,
    "SentLast24Hours": 0.0
}
---------------------------------------

Fastest way to delete a folder structure in Linux

The problem

Deleting a huge folder structure is a pain and can take forever. However, if you want it to take forever divided by 3, you can use rsync.

It seems rsync is about 3 times faster at deleting a huge folder structure than rm or find, because it doesn't read the files before deleting them.

The trick

  • Create an empty folder at /tmp/empty_folder
  • Run the command:

rsync -aP --delete /tmp/empty_folder/ ./target_folder/
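
Putting it together as a copy-paste sketch (./target_folder is a placeholder). Note that rsync only empties the target, so a final rmdir removes the now-empty directory itself:

mkdir -p /tmp/empty_folder
rsync -aP --delete /tmp/empty_folder/ ./target_folder/
rmdir ./target_folder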


Enjoy!

MUI Tooltip overflow problem

If you're encountering an inexplicable overflow issue with the Tooltip component in Material-UI on hover, here's a solution that might help.

First, make sure you have @popperjs/core installed, since the snippet below imports a type from it:

npm install @popperjs/core

Next, in the component containing Tooltip, add the following import:

import { OptionsGeneric } from '@popperjs/core';

Then, update your Tooltip usage as follows:

<Tooltip
  title={'Some text'}
  PopperProps={{
    disablePortal: true,
    popperOptions: {
      modifiers: [
        {
          name: 'preventOverflow',
          options: {
            enabled: true,
            boundariesElement: 'window',
          },
        },
      ] as OptionsGeneric<any>['modifiers'],
    },
  }}
  aria-label='Some text'
>
  {/* ... your content ... */}
</Tooltip>

This configuration adds Popper options to handle overflow.

HTTP pools - Laravel HTTP client

In PHP, unlike JavaScript, we don't have native support for Promises or asynchronous programming. This limitation becomes apparent when you need to make several HTTP requests at the same time. However, Laravel's Http::pool method offers a workaround by sending the requests concurrently.

The code below uses the HTTP pools functionality provided by the Laravel HTTP client, which is a convenient way to execute multiple HTTP requests concurrently:

//dummy code

use Illuminate\Http\Client\Pool;
use Illuminate\Support\Facades\Http;

$responses = Http::pool(fn (Pool $pool) => [
    $pool->get("https://jsonplaceholder.typicode.com/todos/1"),
    $pool->get("https://jsonplaceholder.typicode.com/users/1"),
    $pool->get("https://jsonplaceholder.typicode.com/posts/1")
]);

$todoInfo = $responses[0]->json();
$userInfo = $responses[1]->json();
$postInfo = $responses[2]->json();

This code snippet demonstrates how to make three (or more) simultaneous HTTP calls to different endpoints using HTTP pools in PHP. In a specific situation on the project I'm working on, this approach significantly reduced the load time from 3 seconds to 1.1 seconds.
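
As a small extension of the snippet above, Laravel also lets you name pooled requests with as(), so the responses can be read by key instead of array index. A minimal sketch using the same placeholder URLs:

use Illuminate\Http\Client\Pool;
use Illuminate\Support\Facades\Http;

// name each request, then index the responses by those names
$responses = Http::pool(fn (Pool $pool) => [
    $pool->as('todo')->get("https://jsonplaceholder.typicode.com/todos/1"),
    $pool->as('user')->get("https://jsonplaceholder.typicode.com/users/1"),
]);

// check the response before using it
if ($responses['todo']->ok()) {
    $todoInfo = $responses['todo']->json();
}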

Transfer TB of files in the background via rsync and tmux on Linux

Problem

Transferring files in the background via rsync is possible with nohup and a few other workarounds.

These don't really work when your transfer takes days (>50TB of data) because the background job might get killed by the system.

In order to create a persistent session we need to use tmux.

Solution

Authentication

Create an SSH key and upload it to the server you want to transfer from.

ssh-keygen -t ed25519
ssh-copy-id -i ~/.ssh/id_ed25519.pub [your_old_server.domain.com]

Create a new session and run rsync

tmux new -s [your_session_name]

rsync -avhPW --stats [your_old_server.domain.com]:[/source/path/] [/destination/path]

Detach from session

Press CTRL+B and then D. Your session will detach and keep running in the background.

Re-attach session

To get back into your session just:

tmux a -t [your_session_name]

That's it.

How to dynamically infer TypeScript single object types from tRPC query outputs

This is how you can dynamically infer a single object type from a tRPC Query returning an array of elements of that specific type:

import { inferRouterOutputs } from '@trpc/server';

export const exampleRouter = router({
  exampleQuery: exampleQueryImplementation,
});

// extracting a single object type from an array of that type 
export type ArrayElement<ArrayType extends unknown[] | null> =
  ArrayType extends (infer ElementType)[] ? ElementType : never;

// the output types of a specific router, indexable by query identifiers
type RouterOutput = inferRouterOutputs<typeof exampleRouter>;

// return type of a single query
export type QueryOutputArray = RouterOutput['exampleQuery'];

// the type of each element from the returned array
export type QueryOutputObject = ArrayElement<QueryOutputArray>;
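
A hypothetical usage of the inferred types, just to show the payoff: helpers can be typed against the element type without re-declaring the shape by hand.

// hypothetical helper typed against the inferred element type
const pickFirst = (items: QueryOutputArray): QueryOutputObject | undefined =>
  items?.[0];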

How To Change The Slug of A Custom Post Type in WordPress

To change the slug for a post type in WordPress, you can use the register_post_type() function. Within the arguments array for the function, you can set the rewrite parameter to an array with the new slug you want to use.

Here's an example code snippet:

function change_post_type_slug() {
    $args = array(
        'rewrite' => array( 'slug' => 'new-slug' ),
        // other post type arguments
    );
    register_post_type( 'your_post_type', $args );
}
add_action( 'init', 'change_post_type_slug' );

In this example, replace your_post_type with the name of the post type you want to change, and replace new-slug with the new slug you want to use.

Once you've added this code to your functions.php file, you'll need to go to Settings > Permalinks in the WordPress admin dashboard and click the "Save Changes" button to update your permalinks. This will ensure that your new slug is properly applied.
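
If the code ships in a plugin instead, an alternative to the manual step is flushing the rewrite rules once on activation. A sketch, assuming this sits in the plugin's main file (where __FILE__ is valid for register_activation_hook):

// Hypothetical: flush rewrite rules on plugin activation,
// equivalent to clicking "Save Changes" under Settings > Permalinks.
register_activation_hook( __FILE__, 'flush_rewrite_rules' );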