Replacing text that does NOT match a given string

Agaric likes to ask the hard questions...

ereg_replace beginning is not found
ereg_replace not present
ereg_replace not equal
ereg_replace character does not match

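This is what Agaric came up with. We think that's what the ^ (caret) character does in that context: as the first character inside square brackets it negates the set, so the bracket expression matches any single character that is not listed. Note that it excludes characters, not whole strings.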
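To see that a negated bracket expression excludes single characters rather than a whole string, here is a minimal sketch. It uses preg_match rather than the ereg functions, and the helper name first_char_outside_set is made up for illustration.

```php
<?php
// Hypothetical helper (not from the scraper module): returns true when the
// first character of $src is NOT one of the characters listed in the set.
function first_char_outside_set($src) {
  // Inside brackets, "https?:/" is not a string. It is the set of
  // characters h, t, p, s, ?, : and /. The leading ^ negates that set.
  return preg_match('~^[^https?:/]~', $src) === 1;
}

var_dump(first_char_outside_set('images/logo.png'));        // bool(true): "i" is not in the set
var_dump(first_char_outside_set('http://example.org/a'));   // bool(false): "h" is in the set
var_dump(first_char_outside_set('photos/logo.png'));        // bool(false): "p" is in the set
```

The last call is the catch: a relative path that merely begins with h, t, p or s gets treated like a full URL.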

We tried using the . to also match on shttp, or any one-letter variant of http if there are others. Oh man, the real one is https, and the . doesn't work that way anyway (inside brackets it is just a literal period). Aborting that part of this.

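From our scraper module, an earlier attempt that is not right (inside brackets, (.)? is just more literal characters, and in the last line the bracket expression landed in the replacement string rather than the pattern):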

  // point image paths to original
  $output = ereg_replace('<img src="/', '<img src="' . $domain, $output);

  // point relative image paths to original, attempt to avoid mangling full paths
  $output = ereg_replace('<img src="[^http(.)?://]', '<img src="' . $domain . $path, $output);

  // point URLs to original
  $output = ereg_replace('<a href="', '<a href="/' . $path, $output);
 
  // point relative URLs to original, attempt to avoid mangling full URLs
  $output = ereg_replace('<a href="', '<a href="[^.http://]' . $domain . $path, $output);

Currently:

 
  // point image paths to original
  $output = ereg_replace('<img src="/', '<img src="' . $domain, $output);

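  // point relative image paths to original, attempt to avoid mangling full paths
  // the bracket expression consumes one character, so capture it and restore it with \\1
  // note: this also skips relative paths that begin with h, t, p, s, ?, : or /
  $output = ereg_replace('<img src="([^https?://])', '<img src="' . $domain . $path . '\\1', $output);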

  // point URLs to original
  $output = ereg_replace('<a href="/', '<a href="' . $domain, $output);
 
  // point relative URLs to original, attempt to avoid mangling full URLs
  // the bracket expression belongs in the pattern; capture the character it consumes
  $output = ereg_replace('<a href="([^https?://])', '<a href="' . $domain . $path . '\\1', $output);

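I think preg_replace (which is reportedly faster) works the same for these bracket expressions, though its patterns need delimiters around them. The ereg functions were deprecated in PHP 5.3 and removed in PHP 7, so preg_replace is the one to use on current PHP.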
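For "does NOT match a given string" (as opposed to a given character), PCRE offers a negative lookahead, (?!...), which POSIX ereg patterns do not have. The following is a rough sketch of the relative-path rewrite using that technique instead of a bracket expression. The function name absolutize_relative and the example.org values are assumptions for illustration, not the scraper module's actual code.

```php
<?php
// Hypothetical helper: prefix relative img src / a href values with
// $domain . $path, leaving full and root-relative URLs untouched.
function absolutize_relative($html, $domain, $path) {
  // Group 1 captures the attribute opener. The negative lookahead (?!...)
  // only lets the match succeed when the value does NOT start with
  // http://, https://, or a slash. It consumes no characters.
  $pattern = '~(<(?:img src|a href)=")(?!https?://|/)~i';

  // A callback avoids any special handling of "$" or "\" in the prefix.
  return preg_replace_callback($pattern, function ($m) use ($domain, $path) {
    return $m[1] . $domain . $path;
  }, $html);
}

$html = '<img src="photos/a.png"><a href="https://other.org/">x</a>';
echo absolutize_relative($html, 'http://example.org', '/blog/');
// <img src="http://example.org/blog/photos/a.png"><a href="https://other.org/">x</a>
```

Because the lookahead tests the whole prefix, a relative path such as photos/a.png is rewritten even though it starts with a letter found in "https". Regular expressions are still a blunt tool for HTML, so this assumes double-quoted attributes written exactly as src= or href= right after the tag name.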

