Storing scanned web-pages and files to SQL Server | Database Journal

Storing scanned web-pages and files to SQL Server

Feb 25, 2004
2 minute read


Learn how to explore any
Internet web page and store partial or complete information to either an HTML
file or Microsoft SQL server table.

This article provides a guide on how to explore any
Internet web page and store partial or complete information to either an HTML
file or Microsoft SQL server table. In addition, the method presented here will
aid you to scan a webpage for a particular string and then send that string and
related content as an email.

Method 1:

This method explains how to read
many web pages listed in a text file and store those web pages as HTML files.

Step1:

Create a folder C:\ScanWebPage and list all of the web
sites that you would like to scan in a text file as shown below
(C:\ScanWebPage\WebSiteList.txt).

http://www.microsoft.com

http://www.oracle.com
https://www.databasejournal.com
http://www.dell.com

Step2:

Copy and paste the code below, and save the file as
C:\ScanWebPage\ScanWebPagefromText.vbs

‘Author: MAK
‘Contact: mak_999@yahoo.com
‘Objective: To scan many websites and store it as HTML files
Set iFSO = CreateObject(“Scripting.FilesyStemObject”)
InputFile=”C:\ScanWebPage\WebSiteList.txt”
Set ifile = iFSO.OpenTextFile(inputfile)
Do until ifile.AtEndOfLine
webpage = ifile.ReadLine
Outputfile=replace(webpage,”http://”,””)
Outputfile=”c:\Scanwebpage\”+Outputfile+”.html”
‘msgbox webpage
‘msgbox outputfile
Set objXMLhttp = CreateObject(“MSXML2.XMLhttp”)
With objXMLhttp
  .open “GET”, webpage, False
	On Error Resume Next
  .send
  If Err.Number <> 0 Then
    Msgbox “XMLhttp error ” & Hex(Err.Number) & ” ” & Err.Description
  ElseIf .status <> 200 Then
    MsgBox “http error ” & CStr(.status) & ” ” & .statusText
  Else
    Set objFSO = CreateObject(“Scripting.FileSystemObject”)
    Set objTS = objFSO.CreateTextFile(outputfile, True)
    objTS.Write Replace(.responseText, vbLf, vbNewLine)
    objTS.Close
    Set objTS = Nothing
    Set objFSO = Nothing
  End If
end with
Set objXMLhttp =nothing
Loop
MsgBox “Completed writing all Web Pages!”

Step 3:

Execute the above
VB script to create one HTML file for every website listed in the
WebSiteList.txt. Upon completion, you will be prompted with the following
message box:

Three files will have been created under c:\scanwebpage:

Method 2

This method explains how to read many web pages listed in
a SQL Server table and store those web pages as html files.

Step1:

The following script creates the database and tables for
Scanning Web Pages:

Create Database WebPageScanner
go
use WebPageScanner
go
Create Table WebPages(Id int identity(1,1), WebPage varchar(500))
go
Insert into WebPages(WebPage)  select ‘http://www.microsoft.com’
Insert into WebPages(WebPage)  select ‘http://www.oracle.com’
Insert into WebPages(WebPage)  select ‘https://www.databasejournal.com’
Insert into WebPages(WebPage)  select ‘http://www.dell.com’
Go
use master
go
sp_addlogin ‘WebPageUser’,’Web’,’WebPageScanner’
go
use WebPageScanner
go
sp_adduser ‘WebPageUser’
go
sp_addrolemember ‘db_datawriter’, ‘WebPageUser’
go
sp_addrolemember ‘db_datareader’, ‘WebPageUser’
go

Step2:

Copy and paste the code below into a text file and save as
C:\ScanWebPage\ScanWebPagefromSQL.vbs

‘Author: MAK
‘Contact: mak_999@yahoo.com
‘Objective: To scan many websites from SQL Table and store it as HTML files
Set AdCn = CreateObject(“ADODB.Connection”)
AdCn.commandtimeout =36000
Set AdRec1 = CreateObject(“ADODB.Recordset”)
AdCn.Open = “Provider=SQLOLEDB.1;Data Source=SQL2k\instance1;
Initial Catalog=WebPageScanner;user id = ‘WebPageuser’;password=’Web’ ”
SQL1 = “Select WebPage from WebPages order by [ID]”
AdRec1.Open SQL1, AdCn,1,1
while not Adrec1.EOF
WebPage= Adrec1(“WebPage”)
Outputfile=replace(webpage,”http://”,””)
Outputfile=”c:\Scanwebpage\”+Outputfile+”.html”
‘msgbox webpage
‘msgbox outputfile
Set objXMLhttp = CreateObject(“MSXML2.XMLhttp”)
With objXMLhttp
  .open “GET”, webpage, False
	On Error Resume Next
  .send
  If Err.Number <> 0 Then
    Msgbox “XMLhttp error ” & Hex(Err.Number) & ” ” & Err.Description
  ElseIf .status <> 200 Then
    MsgBox “http error ” & CStr(.status) & ” ” & .statusText
  Else
    Set objFSO = CreateObject(“Scripting.FileSystemObject”)
    Set objTS = objFSO.CreateTextFile(outputfile, True)
    objTS.Write Replace(.responseText, vbLf, vbNewLine)
    objTS.Close
    Set objTS = Nothing
    Set objFSO = Nothing
  End If
end with
Set objXMLhttp =nothing
Adrec1.movenext
Wend
MsgBox “Completed writing all Web Pages!”

Step 3:

Executing the
above VB script will create one HTML file for each website listed in the SQL
table Webpages. Upon completion, you will be prompted with the following
message box:

Three files will have been created under c:\scanwebpage:

Database Journal Logo

DatabaseJournal.com publishes relevant, up-to-date and pragmatic articles on the use of database hardware and management tools and serves as a forum for professional knowledge about proprietary, open source and cloud-based databases--foundational technology for all IT systems. We publish insightful articles about new products, best practices and trends; readers help each other out on various database questions and problems. Database management systems (DBMS) and database security processes are also key areas of focus at DatabaseJournal.com.

Property of TechnologyAdvice. © 2026 TechnologyAdvice. All Rights Reserved

Advertiser Disclosure: Some of the products that appear on this site are from companies from which TechnologyAdvice receives compensation. This compensation may impact how and where products appear on this site including, for example, the order in which they appear. TechnologyAdvice does not include all companies or all types of products available in the marketplace.