Get your MySQL table sizes in MB via a SQL query

Figuring out what tables are making your database excessively large can be really helpful when working with production database exports and keeping various environments up to date. This SQL query outputs that information for all databases on a given server:

SELECT
table_schema as `Database`,
table_name AS `Table`,
round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB`
FROM information_schema.TABLES
ORDER BY table_schema, (data_length + index_length) DESC;

Reference:

Generate MySQL INSERT statements for a few existing records

Occasionally, you’ll want to duplicate a couple of records from your production MySQL database for local use without dumping the entire table (which could be huge). As a solution, the mysqldump command allows you to specify a table and a WHERE query for that table, allowing you to select specific record ID #s and retrieve only those records as INSERT statements:

mysqldump -h {host} -P {port} -u {username} -p{password} {database_name} {table_name} --where="ID = 1" --no-create-info --no-create-db

Reference:

Import Environics demographics data into a SQL database via Laravel CLI command

Environics Analytics provides incredibly useful demographics, psychographics and segmentation data for Canada, but the raw data is challenging if the goal is to use it in a general SQL query: The raw data is in one file, the meanings of the columns are in a ‘metadata’ file, and the data provided doesn’t necessarily just list postal codes but other geographic areas as well.

Linked below is a Laravel CLI command that outputs a .SQL file that will handle the creation and insertion of data and definitions for you. The CLI command is invoked as follows:

php artisan environics_demographics:import {csv_demographics_definitions} {csv_demographics_data} {year_demographics_data} {dir_output_data}

Find the full script on GitHub Gist, or below:

Merge a PHP array without duplicates (array_merge_recursive_distinct)

Often enough in PHP, you’ll grab objects from a variety of sources and want to merge them into a single array of results. To merge without duplicates, add the following function to your codebase and make use of array_merge_recursive_distinct the same way you would array_merge_recursive:

// From: https://www.php.net/manual/en/function.array-merge-recursive.php#92195
if (! function_exists('array_merge_recursive_distinct')) {
    /**
     * @param array<int|string, mixed> $array1
     * @param array<int|string, mixed> $array2
     *
     * @return array<int|string, mixed>
     */
    function array_merge_recursive_distinct(array &$array1, array &$array2): array
    {
        $merged = $array1;
        foreach ($array2 as $key => &$value) {
            if (is_array($value) && isset($merged[$key]) && is_array($merged[$key])) {
                $merged[$key] = array_merge_recursive_distinct($merged[$key], $value);
            } else {
                $merged[$key] = $value;
            }
        }

        return $merged;
    }
}

Reference:

Git pull in a remote branch and accept all incoming changes

Do a git pull that overwrites local conflicts and accepts all remote changes via:

git pull -s recursive -X theirs <remoterepo or other repo>

For example, overwrite your local master branch with the remote origin‘s master branch:

git pull -s recursive -X theirs origin master

If it’s just a single file you need to overwrite, use git checkout:

git checkout --theirs path/to/file

Regular expressions: Find instances when the string does not equal a certain value

Another typical regex scenario: You’re looking through some data (like a JSON object) and are looking for when a value is actually being set. Let’s say our string contains:

"puttLength": null, 

Typically, you’d write a regular expression to capture the <em> tag by writing this:</em>

"puttLength": ((?!null).)*$

The use of ?! is the instruction to make a negative lookahead match, ensuring null does not exist before the line can be returned as a match. A terrific example / tool to test in your browser exists at https://www.regextester.com/15 .

Getting a recursive list of files sorted by last modified date in Linux

Here’s a quick way to get a recursive list of files, sorted in descending order by the last modified date, from any folder in *nix:

find . -type f -printf "%T@ %p\0" | sort -zk1nr | xargs -0 printf "%s\n"

Making scheduled posts in WordPress public so they can be scheduled / scraped on social media

The basics of scheduling in WordPress are quite simple: If you give a WordPress post a published date set in the future, it’ll remain “hidden” on your website until that date and time arrives. It’ll then appear, right on time, at the top of your list of public posts.

But here’s an interesting problem: Social media is now a major driver (maybe the driver) of traffic to digital media websites. When you schedule your post in WordPress, you’ll also naturally want to schedule that post to appear on Facebook and Twitter – but that would require the WordPress post to be public, which in its “future” post status isn’t yet.

The code snippet below can be placed in your theme’s functions.php file, and allows non-logged-in website visitors to view posts in the “future” post status as well as merely “publish”ed posts.

// Allow site visitors (people not logged in to WordPress) to view posts in the "future" post status.
function show_future_posts_pre_get_posts($query) {
    if ( is_single() && !is_admin() && $query->is_main_query() ) { 
        if ( !is_user_logged_in() ) {
            $query->set("post_status", array("publish", "future")); 
        }
    } 
}
add_filter('pre_get_posts', 'show_future_posts_pre_get_posts');

Reference:

How to drop a MS SQL Server database

Dropping a database in MS SQL Server is not a one-step process; you’ll need to do the following:

USE master;
ALTER DATABASE [databasename] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DROP DATABASE [databasename];

Reference:

How to disable WordPress’s internal search system (while still using its search page templates)

On websites like Sportsnet.ca ( http://www.sportsnet.ca/ ) and Maclean’s ( http://www.macleans.ca/ ), the amount of content on those WordPress websites long ago exceeded the level that the built-in WordPress search system can capably handle.

Third-party search solutions such as Google CSE (Custom Search Engine) are being utilized instead, but there is a momentary slowdown before the page loads that is caused by the WordPress site executing its own internal search query before it displays the search results page. In order to remove this delay (and the unnecessary database request), add the following code to the theme’s functions.php file:

// Disable WordPress's internal search query (as much as we can), letting Google CSE handle that.
function internal_search_disable( $query ) {
    if ( !is_admin() && $query->is_main_query() ) {
        if ( $query->is_search ) {
            // Add a filter that effectively returns no results ever.
            add_filter('posts_where', 'internal_search_filter_where');
        }
    }
    return $query;
}
add_action('pre_get_posts', 'internal_search_disable');
 
// The WHERE search filter for disabling the internal search system.
function internal_search_filter_where( $where = '' ) {
    $where = " AND 0 = 1";
 
    // Once added, remove the filter to stop affecting other queries on the page.
    remove_filter('posts_where', 'internal_search_filter_where');
    return $where;
}