11/19/2015

Find and Replace a Multiline Text String in HTML Files Using PowerShell (No Regular Expressions)

Sometimes we have a bunch of text or HTML files, and all we want to do is quickly find a block of HTML and replace it with new HTML. The Find and Replace feature in a normal code editor works great for a handful of files.
But here we are talking about a bunch of files, maybe 10,000 or more...
In this situation we need to write a program to do the job. Why not use PowerShell? It can do this in a few lines of code.


# Steps
  1. Generate a list of files ( $filelist )
  2. Define what you are looking for ( $old )
  3. Define what to replace it with ( $new )
  4. Loop through every file and write the processed content either to a new location (recommended) or over the existing file


# Windows PowerShell Code

# Generate a list of files (-File skips folders whose names end in .html)
$filelist = Get-ChildItem -Filter *.html -Recurse -File

# Case 1
# What you are looking for ...
$old = "<script>
    var _gaq = _gaq || [];
    _gaq.push(['_setAccount', 'UA-19201920192102-1']);
    _gaq.push(['_setDomainName', 'helloworld.com']);
    _gaq.push(['_setAllowLinker', true]);
    _gaq.push(['_trackPageview']);

    (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
    })();
</script>"
# Now replace every newline in $old with a space character.
# Note: this makes the code in your files ugly (everything ends up on one line),
# but you can beautify it again with a tool such as Tidy.
# Or you can use this simple trick:
# # Instead of a space, replace each newline with a custom placeholder,
# # and when you are done, replace the placeholder with a newline again.
# # This way you preserve the readability of the code as well.
# # Example:
# $old = $old -replace "`r`n", " __NEWLINE__ "

# "`r?`n" matches both Windows (CRLF) and Unix (LF) line endings
$old = $old -replace "`r?`n", " "
# Find $old and replace it with $new
$new = " "


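# Main code: loop through each and every file
$filelist.ForEach({
	# $old = $old -replace "`r`n", " __NEWLINE__ "
	# Flatten the file the same way as $old ("`r?`n" handles CRLF and LF)
	$i = (Get-Content $_.FullName -Encoding UTF8 -Raw) -replace "`r?`n", " "
	# String.Replace() is a literal match, so no regex escaping is needed
	$j = $i.Replace($old, $new)
	# This overwrites the original file; point -FilePath elsewhere to keep it
	Out-File -FilePath $_.FullName -Encoding UTF8 -InputObject $j
})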
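
The placeholder trick described in the comments above, combined with the recommended option of writing to a new location, could look roughly like the sketch below. The function names Edit-MultilineText and Export-ReplacedHtml, their parameters, and the placeholder token are assumptions made for illustration; they are not part of the original script. The sketch also assumes the token never appears in your files, and the files it writes use CRLF line endings throughout.

```powershell
# Hypothetical helper: literal multiline find-and-replace that keeps line breaks.
function Edit-MultilineText {
    param(
        [string] $Content,
        [string] $Old,
        [string] $New = '',
        [string] $Token = ' __NEWLINE__ '   # assumed not to occur in the files
    )
    if ([string]::IsNullOrEmpty($Old)) { return $Content }

    # Swap every line break (CRLF or LF) for the placeholder token.
    $flatContent = ($Content -split "`r?`n") -join $Token
    $flatOld     = ($Old     -split "`r?`n") -join $Token
    $flatNew     = ($New     -split "`r?`n") -join $Token

    # String.Replace() matches literally, so nothing needs escaping.
    $result = $flatContent.Replace($flatOld, $flatNew)

    # Turn the placeholders back into line breaks (always CRLF here).
    return $result.Replace($Token, "`r`n")
}

# Hypothetical wrapper: process every *.html file under $SourceRoot and write
# the result to the same relative path under $OutputRoot (originals untouched).
function Export-ReplacedHtml {
    param(
        [string] $SourceRoot,
        [string] $OutputRoot,
        [string] $Old,
        [string] $New = ''
    )
    $src = (Resolve-Path -LiteralPath $SourceRoot).ProviderPath.TrimEnd('\', '/')
    # Make the output path absolute; the folder does not have to exist yet.
    $out = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($OutputRoot)

    foreach ($file in Get-ChildItem -LiteralPath $src -Filter *.html -Recurse -File) {
        $relative = $file.FullName.Substring($src.Length).TrimStart('\', '/')
        $target   = Join-Path $out $relative
        New-Item -ItemType Directory -Force -Path (Split-Path $target -Parent) | Out-Null

        $text = Get-Content -LiteralPath $file.FullName -Raw -Encoding UTF8
        $text = Edit-MultilineText -Content $text -Old $Old -New $New

        # WriteAllText writes UTF-8 without a byte order mark.
        [System.IO.File]::WriteAllText($target, $text)
    }
}
```

With $old and $new defined as above, a call might look like: Export-ReplacedHtml -SourceRoot . -OutputRoot C:\temp\processed -Old $old -New $new (the output path is only an example). Keep the output folder outside the source tree so a later run does not pick up the processed copies.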

