Regular Expressions (Regex) in PHP

Learn regular expressions for pattern matching, data validation, and string manipulation. Regex is a very useful tool for PHP developers.

Introduction to Regular Expressions

A regular expression (regex) is a pattern used to match sequences of characters in a string. PHP uses PCRE (Perl Compatible Regular Expressions).

Basic PCRE Functions

<?php
// preg_match() - searches for a pattern (returns 1 if found)
$text = "Hello World";
$pattern = "/World/";

if (preg_match($pattern, $text)) {
    echo "Pattern found!";
}

// preg_match_all() - finds every occurrence of a pattern
$text = "PHP is great. PHP is powerful.";
$pattern = "/PHP/";
$matches = [];

$count = preg_match_all($pattern, $text, $matches);
echo "Found $count times: ";
print_r($matches[0]);

// preg_replace() - replaces matches of a pattern
$text = "Hello World";
$pattern = "/World/";
$replacement = "PHP";
$result = preg_replace($pattern, $replacement, $text);
echo $result; // "Hello PHP"
?>
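
One detail worth spelling out: preg_match() returns 1 when the pattern matches, 0 when it does not, and false when an error occurs (for example, when the pattern is invalid). The following is a minimal sketch of telling the three cases apart. The helper name describeMatch() is hypothetical, and the @ operator is used only to silence the warning PHP emits for the deliberately broken pattern.

```php
<?php
// Hypothetical helper: turns the return value of preg_match() into a label.
function describeMatch(string $pattern, string $subject): string
{
    // @ hides the warning raised for the invalid pattern in this demo
    $result = @preg_match($pattern, $subject);

    if ($result === false) {
        return 'error'; // the pattern could not be compiled or executed
    }

    return $result === 1 ? 'match' : 'no match';
}

echo describeMatch('/World/', 'Hello World'), "\n";  // match
echo describeMatch('/PHP/', 'Hello World'), "\n";    // no match
echo describeMatch('/(World/', 'Hello World'), "\n"; // error (unbalanced parenthesis)
```

Comparing with === matters here, because both 0 and false are falsy in a plain if statement.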

Special Characters and Metacharacters

Character  Description                          Example
.          Any character except a newline       /a.c/ matches "abc", "a1c"
*          0 or more of the preceding element   /ab*c/ matches "ac", "abc", "abbc"
+          1 or more of the preceding element   /ab+c/ matches "abc", "abbc"
?          0 or 1 of the preceding element      /ab?c/ matches "ac", "abc"
^          Start of the string                  /^Hello/ matches if the string starts with "Hello"
$          End of the string                    /World$/ matches if the string ends with "World"
[]         Character class                      /[abc]/ matches "a", "b", or "c"
()         Grouping                             /(ab)+/ matches "ab", "abab"
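
The quantifier and anchor rows of the table can be compared side by side with preg_grep(), which returns the array entries that match a pattern. This is a small illustrative sketch; the sample strings and variable names are assumptions chosen for the demonstration.

```php
<?php
$subjects = ['ac', 'abc', 'abbc'];
$results = [];

// Compare how *, + and ? treat the same three strings
foreach (['/ab*c/', '/ab+c/', '/ab?c/'] as $pattern) {
    // array_values() reindexes, because preg_grep() keeps the original keys
    $results[$pattern] = array_values(preg_grep($pattern, $subjects));
    echo $pattern, ' => ', implode(', ', $results[$pattern]), "\n";
}
// /ab*c/ => ac, abc, abbc
// /ab+c/ => abc, abbc
// /ab?c/ => ac, abc

// Anchors: ^ ties the match to the start, $ to the end of the string
$greetings = ['Hello World', 'Say Hello'];
$startsWithHello = array_values(preg_grep('/^Hello/', $greetings));
$endsWithHello = array_values(preg_grep('/Hello$/', $greetings));

echo implode(', ', $startsWithHello), "\n"; // Hello World
echo implode(', ', $endsWithHello), "\n";   // Say Hello
```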

Basic Examples

<?php
$text = "PHP is a powerful programming language";

// Case-insensitive matching with the 'i' modifier
if (preg_match("/php/i", $text)) {
    echo "Found PHP (case insensitive)";
}

// Finding a number
$text2 = "I am 25 years old";
if (preg_match("/\d+/", $text2, $matches)) {
    echo "Age: " . $matches[0]; // 25
}

// Checking for an email-like string (a loose check, not full validation)
$email = "user@example.com";
if (preg_match("/\w+@\w+\.\w+/", $email)) {
    echo "Email format looks valid";
}
?>
Tip: A regex starts and ends with a delimiter (usually /). Modifiers such as i (case insensitive) are placed after the closing delimiter.
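
The delimiter does not have to be a slash: PCRE in PHP accepts any character that is not alphanumeric, a backslash, or whitespace. Choosing # or ~ avoids escaping every / in URL patterns. The sketch below also combines two modifiers; the URL and sample text are assumptions made up for the example.

```php
<?php
$url = 'https://example.com/docs/regex';

// With "/" as the delimiter, every literal slash must be escaped
$withSlash = preg_match('/^https:\/\/example\.com\/docs\//', $url);

// With "#" as the delimiter, slashes can be written as-is
$withHash = preg_match('#^https://example\.com/docs/#', $url);

var_dump($withSlash === 1, $withHash === 1); // bool(true) bool(true)

// Modifiers follow the closing delimiter and can be combined
$text = "first line\nSECOND line";

// i: case-insensitive; m: ^ and $ also match at line boundaries
$found = preg_match('/^second/im', $text, $matches);
echo $found === 1 ? $matches[0] : 'no match', "\n"; // SECOND
```

Without the m modifier, ^ only matches at the very start of the string, so the same pattern would not find the second line.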

Pattern Matching

Pattern matching is the core of regex. This section covers several techniques for building precise patterns.

Character Classes

<?php
$text = "Hello123 World!";

// Letters
preg_match_all("/[a-zA-Z]/", $text, $letters);
echo "Letters: " . implode("", $letters[0]); // HelloWorld

// Digits
preg_match_all("/[0-9]/", $text, $numbers);
echo "Digits: " . implode("", $numbers[0]); // 123

// Special characters (anything that is not a letter, digit, or whitespace)
preg_match_all("/[^a-zA-Z0-9\s]/", $text, $special);
echo "Special characters: " . implode("", $special[0]); // !

// Predefined character classes
preg_match_all("/\w/", $text, $words); // Word characters
preg_match_all("/\d/", $text, $digits); // Digits
preg_match_all("/\s/", $text, $spaces); // Whitespace
?>

Quantifiers

<?php
$text = "aaabbbcccc";

// Exact quantifier
preg_match_all("/a{3}/", $text, $matches);
print_r($matches[0]); // ["aaa"]

// Range quantifier
preg_match_all("/b{2,3}/", $text, $matches);
print_r($matches[0]); // ["bbb"]

// Minimum quantifier
preg_match_all("/c{2,}/", $text, $matches);
print_r($matches[0]); // ["cccc"]

// Indonesian mobile number: 08 followed by 8 to 10 digits
$phone = "081234567890";
if (preg_match("/^08\d{8,10}$/", $phone)) {
    echo "Mobile number is valid";
}
?>

Grouping and Capturing

<?php
$text = "Date: 15-01-2024";

// Capturing groups
$pattern = "/(\d{2})-(\d{2})-(\d{4})/";
if (preg_match($pattern, $text, $matches)) {
    echo "Day: " . $matches[1]; // 15
    echo "Month: " . $matches[2]; // 01
    echo "Year: " . $matches[3]; // 2024
    echo "Full match: " . $matches[0]; // 15-01-2024
}

// Named capturing groups
$pattern = "/(?P<day>\d{2})-(?P<month>\d{2})-(?P<year>\d{4})/";
if (preg_match($pattern, $text, $matches)) {
    echo "Day: " . $matches['day'];
    echo "Month: " . $matches['month'];
    echo "Year: " . $matches['year'];
}

// Non-capturing groups
$pattern = "/(?:Mr|Mrs|Ms)\.\s+(\w+)/";
$text = "Mr. Johnson";
if (preg_match($pattern, $text, $matches)) {
    echo "Name: " . $matches[1]; // Johnson
}
?>

Lookahead and Lookbehind

<?php
// Positive lookahead (?=...)
$text = "password123";
$pattern = "/^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}$/";
// The password must contain a digit, a lowercase letter, and an uppercase letter,
// and be at least 8 characters long ("password123" fails: no uppercase letter)

// Negative lookahead (?!...)
$text = "admin";
$pattern = "/^(?!admin|root|guest).+$/";
// The username must not start with admin, root, or guest
// (this also rejects names such as "administrator")

// Positive lookbehind (?<=...)
$text = "USD 100";
$pattern = "/(?<=USD\s)\d+/";
if (preg_match($pattern, $text, $matches)) {
    echo "Amount: " . $matches[0]; // 100
}

// Negative lookbehind (?<!...)
$text = "user@domain.com";
$pattern = "/(?<!admin)@domain\.com/";
// Matches @domain.com only when it is not preceded by admin
?>
Pro Tip: Use an online regex tester to check your patterns before putting them into code.
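
Besides an online tester, a pattern can be checked in PHP itself against a handful of inputs with known outcomes. The sketch below exercises two of the lookaround patterns from this section; the sample inputs and variable names are assumptions chosen for illustration.

```php
<?php
// Password pattern from above: needs a digit, a lowercase letter,
// an uppercase letter, and at least 8 characters in total
$passwordPattern = '/^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}$/';

// Negative lookbehind pattern from above
$domainPattern = '/(?<!admin)@domain\.com/';

// Each check: [pattern, input, expected preg_match() result]
$checks = [
    [$passwordPattern, 'Password123', 1],
    [$passwordPattern, 'password123', 0],    // no uppercase letter
    [$passwordPattern, 'Pass123', 0],        // only 7 characters
    [$domainPattern, 'user@domain.com', 1],
    [$domainPattern, 'admin@domain.com', 0], // "@" is preceded by "admin"
];

$unexpected = 0;
foreach ($checks as [$pattern, $input, $expected]) {
    $actual = preg_match($pattern, $input);
    if ($actual !== $expected) {
        $unexpected++;
    }
    printf("%-18s expected %d, got %d\n", $input, $expected, $actual);
}

echo "Unexpected results: $unexpected\n";
```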

Data Validation with Regex

Regex is very useful for validating user input such as email addresses, phone numbers, passwords, and other data formats.

Email Validation

<?php
function validateEmail($email) {
    // Basic email validation
    $pattern = "/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/";
    return preg_match($pattern, $email);
}

// Stricter variant: the local part and the domain must each start and end with a letter or digit
function validateEmailAdvanced($email) {
    $pattern = "/^[a-zA-Z0-9]([a-zA-Z0-9._-])*[a-zA-Z0-9]@[a-zA-Z0-9]([a-zA-Z0-9.-])*[a-zA-Z0-9]\.[a-zA-Z]{2,}$/";
    return preg_match($pattern, $email);
}

// Test
$emails = [
    "user@example.com",      // valid
    "user.name@domain.co.id", // valid
    "invalid.email",         // invalid
    "@domain.com",           // invalid
    "user@"                  // invalid
];

foreach ($emails as $email) {
    echo "$email: " . (validateEmail($email) ? "Valid" : "Invalid") . "<br>";
}
?>

Password Validation

<?php
function validatePassword($password) {
    $rules = [
        'length' => '/^.{8,}$/',                    // At least 8 characters
        'lowercase' => '/[a-z]/',                   // Lowercase letter
        'uppercase' => '/[A-Z]/',                   // Uppercase letter
        'number' => '/\d/',                         // Digit
        'special' => '/[!@#$%^&*(),.?":{}|<>]/'   // Special character
    ];
    
    $errors = [];
    foreach ($rules as $rule => $pattern) {
        if (!preg_match($pattern, $password)) {
            $errors[] = $rule;
        }
    }
    
    return empty($errors) ? true : $errors;
}

// Advanced password validation
function validatePasswordStrength($password) {
    // Must contain at least one lowercase, one uppercase, one digit, one special char
    // And be at least 8 characters long
    $pattern = '/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*(),.?":{}|<>]).{8,}$/';
    return preg_match($pattern, $password);
}

// Test
$passwords = ["password", "Password1", "Password1!", "Pass1!"];
foreach ($passwords as $pwd) {
    $result = validatePassword($pwd);
    echo "$pwd: " . (is_array($result) ? "Invalid (" . implode(", ", $result) . ")" : "Valid") . "<br>";
}
?>

Indonesian Phone Number Validation

<?php
function validatePhoneNumber($phone) {
    // Remove spaces, dashes, and parentheses
    $clean = preg_replace('/[\s\-\(\)]/', '', $phone);
    
    $patterns = [
        '/^08\d{8,10}$/',           // 08xxxxxxxxx (mobile)
        '/^\+628\d{8,10}$/',        // +628xxxxxxxxx (mobile with country code)
        '/^62\d{9,12}$/',           // 62xxxxxxxxx (with country code)
        '/^0\d{1,2}\d{6,8}$/'       // 0xxx-xxxxxxx (landline)
    ];
    
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $clean)) {
            return true;
        }
    }
    
    return false;
}

// Format phone number
function formatPhoneNumber($phone) {
    $clean = preg_replace('/[\s\-\(\)]/', '', $phone);
    
    // Convert to +62 format
    if (preg_match('/^08(\d+)$/', $clean, $matches)) {
        return '+628' . $matches[1];
    }
    
    return $clean;
}

// Test
$phones = [
    "081234567890",
    "+628123456789",
    "021-12345678",
    "0274-123456",
    "invalid-phone"
];

foreach ($phones as $phone) {
    $isValid = validatePhoneNumber($phone); // validate once and reuse the result
    echo "$phone: " . ($isValid ? "Valid" : "Invalid");
    if ($isValid) {
        echo " (" . formatPhoneNumber($phone) . ")";
    }
    echo "<br>";
}
?>

Validating Other Formats

<?php
// URL validation
function validateURL($url) {
    $pattern = '/^https?:\/\/(?:[-\w.])+(?:\:[0-9]+)?(?:\/(?:[\w\/_.])*(?:\?(?:[\w&%=.])*)?(?:\#(?:[\w.])*)?)?$/';
    return preg_match($pattern, $url);
}

// IP Address validation
function validateIP($ip) {
    $pattern = '/^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/';
    return preg_match($pattern, $ip);
}

// Credit card number (basic format)
function validateCreditCard($number) {
    $clean = preg_replace('/\s+/', '', $number);
    $patterns = [
        'visa' => '/^4\d{12}(?:\d{3})?$/',
        'mastercard' => '/^5[1-5]\d{14}$/',
        'amex' => '/^3[47]\d{13}$/'
    ];
    
    foreach ($patterns as $type => $pattern) {
        if (preg_match($pattern, $clean)) {
            return $type;
        }
    }
    
    return false;
}

// Indonesian KTP number
function validateKTP($ktp) {
    // KTP: 16 digits
    $pattern = '/^\d{16}$/';
    return preg_match($pattern, $ktp);
}

// Test
echo validateURL("https://www.example.com") ? "URL valid" : "URL invalid";
echo "<br>";
echo validateIP("192.168.1.1") ? "IP valid" : "IP invalid";
?>
Note: A regex that validates email addresses perfectly is very complex. In production, consider using filter_var() or a dedicated validation library.
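
As a sketch of the filter_var() alternative mentioned above, PHP's built-in validation filters can cover email addresses, IP addresses, and URLs. filter_var() returns the filtered value on success and false on failure. The wrapper function names below are hypothetical and exist only to give each check a boolean result.

```php
<?php
// Hypothetical wrappers around PHP's built-in validation filters
function isValidEmail(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

function isValidIPv4(string $ip): bool
{
    return filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
}

function isValidURL(string $url): bool
{
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}

$emails = ["user@example.com", "user.name@domain.co.id", "invalid.email", "@domain.com", "user@"];
foreach ($emails as $email) {
    echo "$email: " . (isValidEmail($email) ? "Valid" : "Invalid") . "\n";
}

echo isValidIPv4("192.168.1.1") ? "IP valid" : "IP invalid", "\n";
echo isValidURL("https://www.example.com") ? "URL valid" : "URL invalid", "\n";
```

These filters follow their own rules, so their results may differ from the regex versions above in edge cases.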

Advanced Regex Techniques

Advanced regex techniques for complex cases and performance optimization.

String Manipulation

<?php
// preg_split() - splits a string by a pattern
$text = "apple,banana;orange:grape";
$fruits = preg_split('/[,;:]/', $text);
print_r($fruits); // ["apple", "banana", "orange", "grape"]

// preg_replace_callback() - replaces matches using a callback function
$text = "I have 5 apples and 10 oranges";
$result = preg_replace_callback('/\d+/', function($matches) {
    // Double the number; the callback is expected to return a string
    return (string) ((int) $matches[0] * 2);
}, $text);
echo $result; // "I have 10 apples and 20 oranges"

// preg_replace() with a limit
$text = "PHP is great. PHP is awesome. PHP rocks.";
$result = preg_replace('/PHP/', 'JavaScript', $text, 2); // Replace only first 2 occurrences
echo $result; // "JavaScript is great. JavaScript is awesome. PHP rocks."
?>

Parsing and Extraction

<?php
// Extract all URLs from text
function extractURLs($text) {
    $pattern = '/https?:\/\/(?:[-\w.])+(?:\:[0-9]+)?(?:\/(?:[\w\/_.])*(?:\?(?:[\w&%=.])*)?(?:\#(?:[\w.])*)?)?/';
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

// Extract email addresses
function extractEmails($text) {
    $pattern = '/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/';
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

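// Parse CSV-like data
function parseCSV($line) {
    $pattern = '/(?:^|,)("(?:[^"]+|"")*"|[^",]*)/';
    preg_match_all($pattern, $line, $matches);
    // Group 1 already holds each field without its leading comma
    $data = array_map(function($field) {
        // Strip the surrounding quotes and unescape doubled quotes ("")
        if (strlen($field) >= 2 && $field[0] === '"' && substr($field, -1) === '"') {
            return str_replace('""', '"', substr($field, 1, -1));
        }
        return $field;
    }, $matches[1]);
    return $data;
}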

// Extract hashtags from social media text
function extractHashtags($text) {
    $pattern = '/#([a-zA-Z0-9_]+)/';
    preg_match_all($pattern, $text, $matches);
    return $matches[1];
}

// Test
$text = "Visit https://example.com or email us at info@example.com #awesome #php";
echo "URLs: " . implode(", ", extractURLs($text)) . "<br>";
echo "Emails: " . implode(", ", extractEmails($text)) . "<br>";
echo "Hashtags: " . implode(", ", extractHashtags($text)) . "<br>";
?>

Performance Optimization

<?php
// Use atomic groups for better performance
function optimizedPattern() {
    // Bad: catastrophic backtracking
    $bad = '/^(a+)+b$/';
    
    // Good: atomic group prevents backtracking
    $good = '/^(?>a+)+b$/';
    
    return $good;
}

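// Keep pattern strings in one place for reuse
// (PHP caches compiled patterns internally, so nothing is compiled here)
class RegexValidator {
    private static $patterns = [];
    
    public static function getPattern($name) {
        if (!isset(self::$patterns[$name])) {
            switch ($name) {
                case 'email':
                    self::$patterns[$name] = '/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/';
                    break;
                case 'phone':
                    self::$patterns[$name] = '/^08\d{8,10}$/';
                    break;
                default:
                    // Fail loudly instead of reading an undefined array entry
                    throw new InvalidArgumentException("Unknown pattern: $name");
            }
        }
        return self::$patterns[$name];
    }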
    
    public static function validate($type, $value) {
        $pattern = self::getPattern($type);
        return preg_match($pattern, $value);
    }
}

// Benchmarking regex performance
function benchmarkRegex($pattern, $text, $iterations = 10000) {
    $start = microtime(true);
    
    for ($i = 0; $i < $iterations; $i++) {
        preg_match($pattern, $text);
    }
    
    $end = microtime(true);
    return ($end - $start) * 1000; // milliseconds
}

// Test performance
$pattern1 = '/^(a+)+b$/';      // Problematic pattern
$pattern2 = '/^(?>a+)+b$/';   // Optimized pattern
$text = str_repeat('a', 20);

// echo "Pattern 1: " . benchmarkRegex($pattern1, $text) . " ms";
echo "Pattern 2: " . benchmarkRegex($pattern2, $text) . " ms";
?>

Custom Regex Class

<?php
class RegexHelper {
    private $lastError = null;
    
    public function match($pattern, $subject, &$matches = null) {
        $result = @preg_match($pattern, $subject, $matches);
        $this->checkError();
        return $result;
    }
    
    public function matchAll($pattern, $subject, &$matches = null) {
        $result = @preg_match_all($pattern, $subject, $matches);
        $this->checkError();
        return $result;
    }
    
    public function replace($pattern, $replacement, $subject, $limit = -1, &$count = null) {
        $result = @preg_replace($pattern, $replacement, $subject, $limit, $count);
        $this->checkError();
        return $result;
    }
    
    public function split($pattern, $subject, $limit = -1, $flags = 0) {
        $result = @preg_split($pattern, $subject, $limit, $flags);
        $this->checkError();
        return $result;
    }
    
    public function escape($string) {
        return preg_quote($string, '/');
    }
    
    private function checkError() {
        $error = preg_last_error();
        if ($error === PREG_NO_ERROR) {
            // Clear any error left over from an earlier call
            $this->lastError = null;
            return;
        }
        $errors = [
            PREG_INTERNAL_ERROR => 'Internal error',
            PREG_BACKTRACK_LIMIT_ERROR => 'Backtrack limit exceeded',
            PREG_RECURSION_LIMIT_ERROR => 'Recursion limit exceeded',
            PREG_BAD_UTF8_ERROR => 'Invalid UTF-8',
            PREG_BAD_UTF8_OFFSET_ERROR => 'Invalid UTF-8 offset',
            PREG_JIT_STACKLIMIT_ERROR => 'JIT stack limit exceeded'
        ];
        $this->lastError = $errors[$error] ?? 'Unknown error';
    }
    
    public function getLastError() {
        return $this->lastError;
    }
}

// Usage
$regex = new RegexHelper();
$result = $regex->match('/\d+/', 'I have 5 apples', $matches);
if ($result) {
    echo "Found: " . $matches[0];
} else {
    echo "Error: " . ($regex->getLastError() ?? 'No match');
}
?>
Warning: Complex regex can cause catastrophic backtracking and performance issues. Always test with large inputs and consider non-regex alternatives for complex parsing.
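
As an illustration of the non-regex alternatives mentioned in the warning, PHP ships dedicated parsers for some common formats. The sketch below uses str_getcsv() for a CSV line and parse_url() for a URL; the sample inputs are assumptions made up for the example.

```php
<?php
// CSV line with a quoted field containing a comma and a field with escaped quotes
$line = 'apple,"banana, ripe","say ""hi"""';

// Delimiter, enclosure, and escape character are passed explicitly
$fields = str_getcsv($line, ',', '"', '\\');
print_r($fields); // apple / banana, ripe / say "hi"

// parse_url() splits a URL into its components without a hand-written pattern
$parts = parse_url('https://example.com:8080/docs/regex?lang=php#intro');
echo $parts['scheme'], "\n";   // https
echo $parts['host'], "\n";     // example.com
echo $parts['port'], "\n";     // 8080
echo $parts['path'], "\n";     // /docs/regex
echo $parts['query'], "\n";    // lang=php
echo $parts['fragment'], "\n"; // intro
```

Note that parse_url() only splits a URL into parts and does not validate it.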
Level: Intermediate

Requires a basic understanding of PHP
