I know of nothing ready made, no.
But I'll attach a Python script I use to replicate a gallery of pictures; I think it will give you the idea.
If you can get an export, the simplest way is to ask for a database dump (or export, or backup, depending on the database's terminology) and import it into an instance running on your servers.
From there, the migration to your database will be much less difficult.
So, for example, since we are in the PHP forum, I'll assume your DB is MySQL.
To keep things simple, I'll assume their DB is MySQL too.
1) They create a dump of their database.
2) You get that dump and restore it on your server.
3) That's all!
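For the MySQL-to-MySQL case, steps 1 and 2 could be scripted roughly like this. It is only a sketch: it assumes the standard `mysqldump` and `mysql` command-line clients are installed, and the function names, user name and file name are made up for the example.

```python
import subprocess

def dump_command(user, database, out_file):
    """Build the mysqldump call that writes `database` to `out_file`."""
    # --password with no value makes the client prompt for it.
    return ["mysqldump", "--user=" + user, "--password",
            "--single-transaction", "--result-file=" + out_file, database]

def restore_command(user, database):
    """Build the mysql client call; the dump file is fed to it on stdin."""
    return ["mysql", "--user=" + user, "--password", database]

def dump(user, database, out_file):
    # Step 1, run on their side.
    subprocess.check_call(dump_command(user, database, out_file))

def restore(user, database, dump_file):
    # Step 2, run on your side; the target database must already exist.
    with open(dump_file, "rb") as handle:
        subprocess.check_call(restore_command(user, database), stdin=handle)
```

They would call something like `dump("someuser", "their", "their.sql")`, send you the file, and you would call `restore("someuser", "their", "their.sql")` on your server.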
If they use another database, you will need to install an instance of that database.
Then you (re)create the tables in your DB and write a program that reads the data from the source system and writes the rows to your target system.
Something like this (a sketch, adapt it to your setup):
PHP Code:
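// Placeholders: adapt the DSNs, user names and passwords to both systems.
// PDO has drivers for most databases, so only the source DSN should change.
$options=array(PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION);
$dbSrc=new PDO('mysql:host=their-host;dbname=their', 'user', 'password', $options);
$dbTrg=new PDO('mysql:host=localhost;dbname=we', 'user', 'password', $options);
$qSrc="select field1, field2, field3 from table1";
$rSrc=$dbSrc->query($qSrc);
// A prepared statement quotes and escapes every value for us.
$sTrg=$dbTrg->prepare("insert into table1 (field1, field2, field3) values (?, ?, ?)");
while($oSrc=$rSrc->fetch(PDO::FETCH_OBJ)){
    $sTrg->execute(array($oSrc->field1, $oSrc->field2, $oSrc->field3));
}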
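The same read-then-insert loop, as a small Python sketch you can run as is. Assumptions: two in-memory SQLite databases stand in for the source and target systems, and `copy_table` is a name invented for this example. The `?` placeholder is SQLite's; other drivers may use a different placeholder style.

```python
import sqlite3

def copy_table(src, trg, table, fields):
    """Copy the listed fields of `table` from src to trg, row by row."""
    columns = ", ".join(fields)
    placeholders = ", ".join("?" for _ in fields)
    select_sql = "select %s from %s" % (columns, table)
    insert_sql = "insert into %s (%s) values (%s)" % (table, columns, placeholders)
    copied = 0
    for row in src.execute(select_sql):
        # Values are passed as parameters, never pasted into the SQL text.
        trg.execute(insert_sql, row)
        copied += 1
    trg.commit()
    return copied

# Demo: two throwaway databases with the same (hypothetical) table.
src = sqlite3.connect(":memory:")
trg = sqlite3.connect(":memory:")
for db in (src, trg):
    db.execute("create table table1 (field1 integer, field2 text, field3 text)")
src.executemany("insert into table1 values (?, ?, ?)",
                [(1, "O'Brien", "a"), (2, "x", None)])
src.commit()
copied = copy_table(src, trg, "table1", ["field1", "field2", "field3"])
```

Passing the values as parameters is what keeps a name like O'Brien or a NULL from breaking the insert.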
Once you have the original data in your DB, converting it to your structure will still need a bit of work, but it will be easier.
For instance, imagine they keep everything about a user in one table, but you use two tables (one for the mandatory info and another for the optional info).
You would transfer it this way:
Code:
/*
source system database name: their
your database name: we
*/
/*
To migrate the users, we first insert what we need into the base table.
The we.usersBase.id and we.userDetail.id fields are not auto_increment,
which allows us to keep the relation with the original data.
*/
insert into we.usersBase (id, name, username, email)
select id, name, username, email
from their.users;
/*
Now we do the same for the other elements.
*/
insert into we.userDetail (id, birthDate, interest, dogsName)
select id, birthDate, interest, dogsName
from their.users;
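If you want to try the split before touching real data, here is a small self-contained Python sketch. Assumptions: it uses one in-memory SQLite database, so the source table is called `their_users` instead of `their.users`, and the two sample rows are invented.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
create table their_users (id integer primary key, name text, username text,
                          email text, birthDate text, interest text, dogsName text);
create table usersBase (id integer primary key, name text, username text, email text);
create table userDetail (id integer primary key, birthDate text, interest text, dogsName text);
insert into their_users values (7, 'Ann', 'ann', 'ann@example.com', '1980-01-02', 'chess', 'Rex');
insert into their_users values (9, 'Bob', 'bob', 'bob@example.com', null, null, null);
""")

# Mandatory info goes to the base table, with the original id.
db.execute("insert into usersBase (id, name, username, email) "
           "select id, name, username, email from their_users")
# Optional info goes to the detail table, again with the original id.
db.execute("insert into userDetail (id, birthDate, interest, dogsName) "
           "select id, birthDate, interest, dogsName from their_users")
db.commit()

# Because both tables carry the same id, the rows can be joined back together.
rejoined = db.execute(
    "select b.id, b.username, d.dogsName from usersBase b "
    "join userDetail d on d.id = b.id order by b.id").fetchall()
```

Both inserts copy the id, which is the whole point: it is the only thing that links a detail row back to its base row.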
And finally, this is a little script I used a long time ago to fetch a batch of pages from a gallery.
It's in Python (Python 2, with urllib2 and BeautifulSoup 3), but the logic should be easy to follow.
There is a class "Parser" that opens each page one after the other and looks for "<img>" tags in it.
For each tag it finds that contains "pics/" in the path, it gives that URL to a "Downloader" object that fetches it and stores it locally.
Code:
from __future__ import division
import BeautifulSoup, os, sys, random
import threading, time, urllib2
class Downloader(threading.Thread):
    def __init__(self, parent, url, dest, origin):
        threading.Thread.__init__(self)
        self.url=url
        self.origin=origin
        self.dest=dest
        self.parent=parent
        if not os.path.exists(dest):
            os.mkdir(dest)
    def run(self):
        ret=False
        cpt=0
        self.parent.running+=1
        file=os.path.basename(self.url)
        # files are sorted into one sub-directory per first letter
        part=file[0].lower()
        partDir=os.path.join(self.dest, part)
        locFile='%s/%s/%s'%(self.dest, part, file)
        # create the sub-directory under the lock, so two threads don't race
        self.parent.lock.acquire()
        if not os.path.isdir(partDir):
            os.makedirs(partDir)
        self.parent.lock.release()
        worked=True
        if os.path.exists(locFile) and os.path.isfile(locFile):
            ret=True
            worked=False
            log('Downloader :: File %s exists in %s'%(file, part))
        # try up to 10 times before giving up on this url
        while ret==False and cpt<10:
            try:
                self.parent.lock.acquire()
                log('Downloader :: start fetch (%d) >%s<'%(cpt,self.url))
                self.parent.lock.release()
                con=urllib2.Request(self.url, headers={'User-Agent':'Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11'
                    ,'Accept-Charset':'utf-8;q=0.7,*'},origin_req_host=self.origin
                )
                hRemote=urllib2.urlopen(con)
                data=hRemote.read()
                hRemote.close()
                # write in binary mode, and only once the download succeeded,
                # so a failed fetch never leaves an empty file behind
                hLocal=open(locFile,'wb')
                hLocal.write(data)
                hLocal.close()
                ret=True
            except urllib2.URLError, msg:
                log('Downloader :: ERROR::%s'%msg)
                ret=False
            except Exception, msg:
                log('Downloader :: UNKNOWN ERROR::%s'%msg)
            finally:
                cpt+=1
        self.parent.lock.acquire()
        if worked==True:
            log('Downloader :: Finished url %s'%self.url)
        self.parent.lock.release()
        self.parent.running-=1
        return ret
class Parser():
    def __init__(self, url):
        self.url=str(url)
        self.base='http://someplace.com/'
        self.dest=os.path.abspath(os.path.join(os.path.dirname(__file__),'someplace_img'))
        self.page=None
        self.max=6581 #Total nbr of pages to check for new files
        self.pattern=str('@@')
        self.parsed=[]
        self.running=0
        self.maxThreads=10
        self.lock=threading.Lock()
        self.Parse()
    def Parse(self):
        while len(self.parsed)<self.max:
            while self.page==None or self.page in self.parsed:
                self.page=random.randint(0,self.max)
            log('randomly chosen page %d'%(int(self.page)))
            url=self.url
            url=url.replace(self.pattern,str(self.page))
            con=urllib2.Request(url, headers={'User-Agent':'Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11'
                ,'Accept-Charset':'utf-8;q=0.7,*'}
            )
            perc=float(1*(len(self.parsed)/self.max))*100
            log('completed %f%% (%d/%d)'%(perc,len(self.parsed),self.max))
            log('Parsing page at %s'%(url))
            gotHandle=False
            while gotHandle==False:
                try:
                    handle=urllib2.urlopen(con)
                    gotHandle=True
                except urllib2.URLError, msg:
                    time.sleep(1)
                    gotHandle=False
            source=''
            for lines in handle:
                source+=lines
            try:
                soup = BeautifulSoup.BeautifulSoup(source,fromEncoding="ascii")
                for img in soup.findAll('img'):
                    if img['src'].find('pics/')>-1:
                        imgUrl='%s%s'%(self.base,img['src'])
                        dwn=Downloader(self, imgUrl,self.dest, url)
                        while self.running>=self.maxThreads:
                            #log('Too many threads: %d. Sleeping'%self.running)
                            time.sleep(1)
                        dwn.start()
            except UnicodeDecodeError:
                log('Beautifulsoup could not parse %s because of invalid utf-8 chars in the source'%(url))
            while self.running>0:
                time.sleep(3)
            self.parsed.append(self.page)
def log(string):
    print string
if __name__=='__main__':
    url='http://someplace.com/index.php?pageno=@@&sort=ever'
    parser=Parser(url)
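If you only want the "find the pictures in a page" part, here is a short Python 3 sketch of that step alone. It is not the script above: it swaps BeautifulSoup for the standard library's `html.parser`, and the names `ImgCollector` and `page_url` are invented for this example. It parses a hard-coded snippet instead of fetching anything.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImgCollector(HTMLParser):
    """Collect the src of every <img> whose path contains `marker`."""
    def __init__(self, base, marker="pics/"):
        super().__init__()
        self.base = base
        self.marker = marker
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        # An <img> without a src attribute is simply skipped.
        src = dict(attrs).get("src") or ""
        if self.marker in src:
            self.urls.append(urljoin(self.base, src))

def page_url(template, page, pattern="@@"):
    """Put the page number where the @@ pattern sits in the url."""
    return template.replace(pattern, str(page))

collector = ImgCollector("http://someplace.com/")
collector.feed('<p><img src="pics/a.jpg"><img src="logo.png"><img alt="no src"></p>')
first_page = page_url("http://someplace.com/index.php?pageno=@@&sort=ever", 12)
```

After the `feed` call, `collector.urls` holds the absolute URLs you would hand over to a downloader.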