Posts Tagged ‘import’

Migrating BlinkList bookmarks and Powermarks bookmarks to del.icio.us

Friday, July 11th, 2008

It’s done. I kept running into problems with BlinkList, so I decided to move to del.icio.us. I also decided to rescue my old Powermarks 3.5 bookmarks from oblivion and import them into del.icio.us too.

BlinkList lets you export your bookmarks in JSON format via Options -> Export links.

So grab the JSON file and save it somewhere on your disk.

The script below loads the bookmarks into del.icio.us, but first make sure that you have Ruby or JRuby, RubyGems, json-jruby or json-ruby, jruby-openssl and rubilicious installed.

If you use JRuby, you can install everything like this:

jruby -S gem install json-jruby  jruby-openssl rubilicious-0.2.0.gem

Then use the following script to load all the bookmarks from the JSON file into del.icio.us. Just change the filename, username and password to suit your needs.

#!/usr/bin/ruby
 
require "rubygems"
require "rubilicious"
require "json"
require "date"
require "time"
 
 
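# Use the bookmark's original timestamp; fall back to the current time
# when the export holds false (or nothing at all) instead of a date.
def getTime(item)
  dateadd = item['dateadd']
  return Time.at(dateadd) if dateadd
  return Time.now
end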
 
def getIsPrivate(item)
  isprivate = item['private']
  return "checked"==isprivate
end
 
def getTags(item)
  item['tag'].gsub(' ', '_').gsub(',',' ')
end
 
json_string = File.read("blinklist20080710.json")
 
result = JSON.parse(json_string)
 
r = Rubilicious.new('your_delicious_username','your_delicious_password')
 
i=0
for item in result do 
  i += 1
  puts "#{i}: #{item['url']}"
  #next if i < 3229
  r.add(item['url'],item['name'],item['description'], getTags(item), getTime(item), true, getIsPrivate(item))
end 
puts "ended"

If the script fails in the middle of the import, don’t worry. Just uncomment the “#next if i < 3229” line and change 3229 to the number printed for the last bookmark that was loaded. Rerun the script and it will skip all the bookmarks before that one (the bookmark itself is simply added again).
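
To make the tag conversion and the resume trick concrete, here is a small sketch that runs without a del.icio.us account. The sample items are made up and only assume the fields the script above reads (url, name, tag); convert_tags and items_from are illustrative names of mine, not part of Rubilicious or BlinkList.

```ruby
require "json"

# Hypothetical sample in the shape the import script expects; a real
# BlinkList export may carry more fields than these.
SAMPLE_JSON = <<~JSON
  [
    {"url": "http://example.com/a", "name": "A", "tag": "ruby,web design"},
    {"url": "http://example.com/b", "name": "B", "tag": "news"},
    {"url": "http://example.com/c", "name": "C", "tag": "open source,linux"}
  ]
JSON

# The export separates tags with commas and allows spaces inside a tag,
# while del.icio.us separates tags with spaces. Same conversion as getTags.
def convert_tags(tag_string)
  tag_string.gsub(" ", "_").gsub(",", " ")
end

# Return [position, item] pairs starting at the 1-based position resume_at,
# which is the effect of "next if i < resume_at" in the import loop.
def items_from(items, resume_at)
  items.each_with_index
       .map { |item, index| [index + 1, item] }
       .select { |position, _item| position >= resume_at }
end

items = JSON.parse(SAMPLE_JSON)
items_from(items, 2).each do |position, item|
  puts "#{position}: #{item['url']} [#{convert_tags(item['tag'])}]"
end
```

Running it lists only the second and third sample bookmarks, with their tags already in the space-separated form.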

Loading the old Powermarks file into del.icio.us is a little more complex. You will need two files:

1) state_pattern.rb (from Maurice Codik’s blog). I’m copying it here for completeness’ sake.

#!/usr/bin/ruby
 
#   Copyright (C) 2006 Maurice Codik - maurice.codik@gmail.com
#
#   Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
#   associated documentation files (the "Software"), to deal in the Software without restriction,
#   including without limitation the rights to use, copy, modify, merge, publish, distribute,
#   sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is
#   furnished to do so, subject to the following conditions:
#
#   The above copyright notice and this permission notice shall be included in all copies or substantial
#   portions of the Software.
#
#   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT
#   LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
#   IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
#   WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
#   SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 
 
# an example:
#
# class Connection
#   include StatePattern
#   state :initial do # you always need a state named initial. this is where you begin.
#     def connect
#       puts "connected"
#    # move to state :connected. all other args to transition_to are passed to the new state's constructor
#       transition_to :connected, "hello from initial state" 
#     end
#     def disconnect
#       puts "not connected yet"
#     end
#   end
#   state :connected do
#     def initialize(msg)
#       puts "initialize got msg: #{msg}"
#     end
#     def connect
#       puts "already connected"
#     end
#     def disconnect
#       puts "disconnecting"
#       transition_to :initial
#     end
#   end
#   def reset
#     puts "reseting outside a state"
#     # you can also change the state from outside of the state objects
#     transition_to :initial
#   end
# end
 
# how's it work:
# Each call to state defines a new subclass of Connection that is stored in a hash. 
# Then, a call to transition_to instantiates one of these subclasses and sets it to be the active state.
# Method calls to Connection are delegated to the active state object via method_missing. 
 
module StatePattern
  class UnknownStateException < Exception
  end
 
  def StatePattern.included(mod)
    mod.extend StatePattern::ClassMethods
  end
 
  module ClassMethods
    attr_reader :state_classes
    def state(state_name, &block)
      @state_classes ||= {}
 
      new_klass = Class.new(self, &block)
      new_klass.class_eval do
        alias_method :__old_init, :initialize
        def initialize(context, *args, &block)
          @context = context
          __old_init(*args, &block)
        end
      end
 
      @state_classes[state_name] = new_klass
    end
  end
 
  attr_accessor :current_state, :current_state_obj
 
  def transition_to(state_name, *args, &block)
    new_context = @context || self
 
    klass = new_context.class.state_classes[state_name]
    if klass
      new_context.current_state = state_name
      new_context.current_state_obj = klass.new(new_context, *args, &block)
    else
      raise UnknownStateException, "tried to transition to unknown state, #{state_name}"      
    end
  end
 
  def method_missing(method, *args, &block)
    unless @current_state_obj
      transition_to :initial
    end
    if @current_state_obj
      @current_state_obj.send(method, *args, &block)
    else
      super
    end
  end
 
end

2) The script that parses the Powermarks file and loads it into del.icio.us

#!/usr/bin/ruby
require "rubygems"
require "rubilicious"
require "json"
require "date"
require "time"
require "state_pattern"
 
class Parser
  include StatePattern
 
  attr_accessor :name, :url,:desc,:tags ,:date , :r
 
  state :initial do     
    def parse(line)
      #puts "initial: #{line}"
      if line  =~ /<a href="(.*)">(.*)<\/a>/
        @context.name = $2 
        @context.url = $1  
        transition_to :read_keywords        
      end
    end
  end
  state :read_keywords do 
    def parse(line)
      if line =~ /<!--keywords-->(.*)$/
        @context.tags = $1.chomp
        transition_to :read_keywords2
      end
    end
  end
 
  state :read_keywords2 do
    def parse(line)      
      #puts "read_keywords2: #{line}"       
      if line =~ /<!--/
        if line =~ /^<!--desc-->/
          transition_to :read_desc
          @context.current_state_obj.parse(line)
        end         
        if line =~ /^<!--mdata/
          transition_to :read_metadata
          @context.current_state_obj.parse(line)
        end
        return
      end
      @context.tags += " " + line.chomp      
    end
  end
 
  state :read_desc do 
    def parse(line)      
      if line =~ /<!--/
        if line =~ /<!--desc-->(.*)/
          @context.desc = $1.chomp
        else
          #puts "not desc"
          if line =~ /<!--mdata/
            transition_to :read_metadata
            @context.current_state_obj.parse(line)            
          else
            raise "don't know how to parse this in this state #{line}"
          end              
          return
        end               
      else
        @context.desc += " " + line.chomp
      end
    end
  end
 
  state :read_metadata do    
    def parse(line)      
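      # The first hexadecimal group of the metadata comment is used as the bookmark date
      @context.date = $1.hex if line =~ /<!--mdata=\[\w+\]\[([0-9A-F]+)\]\[([0-9A-F]+)\]\[([0-9A-F]+)\]/
      # Fall back to the current time when no usable date was found
      @context.date = Time.now.to_i if @context.date.nil? || @context.date < 0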
 
 
      puts "=============================="      
      puts "name: #{@context.name}"
      puts "url:  #{@context.url}"
      puts "tags: #{@context.tags}"
      puts "date: #{Time.at(@context.date)}"
      puts "desc: #{@context.desc}" unless @context.desc.nil?
      puts "=============================="
      @context.r.add(@context.url,@context.name,@context.desc, @context.tags, Time.at(@context.date), false, true)
      @context.name = @context.url = @context.tags = @context.date = @context.desc = nil
 
      transition_to :initial
    end
  end
 
end
 
 
r = Rubilicious.new('your_delicious_username','your_password')
p = Parser.new
p.r = r;
i = 0
File.new("pm3520070703.htm").each { |line| 
  puts i; 
  i += 1 
  #next unless i >2261
  p.parse(line);  
}
 
puts "ended"

(This script adds all the links as private. If you don’t want that behaviour, change the last argument in “@context.r.add(@context.url,@context.name,@context.desc, @context.tags, Time.at(@context.date), false, true)” from “true” to “false”.)

Again, if the script fails in the middle of the import, don’t worry. Just uncomment the “#next unless i > 2261” line and change 2261 to the line number where you want to resume parsing the Powermarks file. Rerun the script and it will skip all previous lines.
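
The date handling is the least obvious part of the parser, so here is a minimal sketch of it in isolation. It reuses the regular expression from the script and, like the script, treats the first hexadecimal group as a Unix timestamp. The sample line is invented for illustration and bookmark_time is a hypothetical helper; the fallback for a line without metadata is my own addition.

```ruby
# Same pattern the parser uses for the metadata comment; only the first
# hexadecimal group is read here.
MDATA_PATTERN = /<!--mdata=\[\w+\]\[([0-9A-F]+)\]\[([0-9A-F]+)\]\[([0-9A-F]+)\]/

# Return the bookmark time, or `fallback` when the line carries no
# metadata comment (an assumption of this sketch, not Powermarks behaviour).
def bookmark_time(line, fallback = Time.now)
  match = MDATA_PATTERN.match(line)
  return fallback if match.nil?

  # String#hex converts the hexadecimal digits to an integer number of seconds.
  Time.at(match[1].hex)
end

# Hypothetical metadata line; the values are made up.
sample = "<!--mdata=[1][46896F2A][46896F2A][0]-->"
puts bookmark_time(sample).utc
```

With this made-up value the printed date falls in early July 2007, which is the kind of timestamp the real file should contain if the first group really is the creation date.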

I hope this helps anybody who is trying to escape from BlinkList and/or Powermarks. I successfully imported 3299 BlinkList bookmarks and 4000 Powermarks bookmarks (a lot of dupes, though). By the way, the first script replaces any existing bookmark with the same URL and the second script does not. That’s the way I wanted it, but of course you can change it: the second-to-last parameter in the call to add is the one that controls the “replace” behaviour (see add documentation).

Merging two TikiWikis

Friday, June 29th, 2007

I’ve created a Ruby script to merge the content of one TikiWiki into another. It reads the tiki_pages, tiki_history and tiki_links tables from the MySQL backend of the source TikiWiki and imports their contents into the destination TikiWiki. The script is ‘safe’, meaning that it will not overwrite a page that already exists in the destination. If the page exists at the destination, its history is merged as well. The script doesn’t handle page attachments yet.
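
The ‘safe’ rule can be pictured with a small in-memory sketch. It is not part of the script: merge_action and the sample hashes are hypothetical, and the timestamps are converted with to_i on the assumption that the database driver hands them back as strings.

```ruby
# Decide what happens to one page from the source wiki.
# dest_page is nil when the destination has no page with that name.
def merge_action(source_page, dest_page)
  return :insert if dest_page.nil? # new page: copy it over

  # Compare as integers; comparing the raw strings would sort "20" before "9".
  newer = source_page["lastModif"].to_i > dest_page["lastModif"].to_i
  newer ? :conflict : :keep # never overwrite, only report the conflict
end

# Made-up data: one page exists on both sides, one only in the source.
destination = {
  "HomePage" => { "pageName" => "HomePage", "lastModif" => "1183000000" }
}
sources = [
  { "pageName" => "HomePage", "lastModif" => "1183100000" },
  { "pageName" => "Roadmap",  "lastModif" => "1182000000" }
]

sources.each do |page|
  action = merge_action(page, destination[page["pageName"]])
  puts "#{page['pageName']}: #{action}"
end
```

In the real script a conflict only ends up in the log and in the notification file, so the people who edited the page can sort it out by hand.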

Here is the full script. Be sure to change the connection parameters to match your user and password on each database (search for the oldtiki and newtiki variables in the script) and the charsets (search for SET_CHARSET_NAME in the script).

#
# Script to merge two TikiWikis
#
require "rubygems"
require "mysql"
require "log4r"
require "iconv"
include Log4r
 
def createInsert(dbh,  table,  row ) 
      insertSt = "insert into #{table} "
      col ="("
      values = "VALUES("
 
      row.each { |key, value|
        col += "#{key},"
        if value and not value.empty?  then
          values +=  "'#{dbh.escape_string(value)}',"
        else 
          values += "NULL,"
        end    
      }
      col = col[0..-2] + ")"      
      values = values[0..-2] + ")"
      insertSt += col + " " + values
      # Return the statement without a trailing semicolon
      insertSt
end
 
 
 
$conflictNotif = {}
def addToConflictNotifList (user, pagename) 
 
  pageList = $conflictNotif[user]
  if not pageList then 
    $conflictNotif[user] = []
    pageList = $conflictNotif[user]
  end
  pageList << pagename        
end
 
$notif = {}
def addToNotifList (user, pagename) 
 
  pageList = $notif[user]
  if not pageList then 
    $notif[user] = []
    pageList = $notif[user]
  end
  pageList << pagename        
end
 
def printToFile (path, hash) 
  fileConflicts   = File.open(path, File::WRONLY|File::TRUNC|File::CREAT) 
  hash.each { |user, list|
     fileConflicts.print "#{user}: "
     fileConflicts.print list.join(',')
     fileConflicts.print "\n"
  }
  fileConflicts.close
end
 
Log4r::Logger.root.level = Log4r::DEBUG
 
l = Logger.new 'tiki_pages'
l.outputters = Outputter.stdout,FileOutputter.new("tiki_pages", :filename => "tiki_pages.txt", :trunc => true, :level => Log4r::DEBUG)
 
lh = Logger.new 'tiki_history'
 
lh.outputters = Outputter.stdout,FileOutputter.new("tiki_history", :filename => "tiki_history.txt", :trunc => true, :level => Log4r::DEBUG)
 
 
 
lm = Logger.new 'mysqlstatements'
lm.outputters = FileOutputter.new("sqlfile", :filename => "commands.sql", :trunc => true, :level=>Log4r::DEBUG)
 
 
l.debug "Starting migration script"
 
 
oldtiki="sql1.example.com"
olduser="root"
olddbname="tiki"
oldpwd="secret"
 
newtiki="sql2.example.com"
newuser="root"
newdbname="rd_tiki_wiki"
newpwd="secret"
 
#select login,email from users_users;
#mysql> select tiki_pages.pagename,users_users.email from tiki_pages,users_users where tiki_pages.user=users_users.login;
 
 
begin
  #connect to the MySQL server
  l.debug "trying to connect..."
  dbhold = Mysql.init
  dbhold.options(Mysql::SET_CHARSET_DIR, "/root/tikiWikiScript/charsets/")
  dbhold.options(Mysql::SET_CHARSET_NAME, "utf8")
  dbhold.real_connect(oldtiki,olduser, oldpwd,olddbname)
  # get server version string and display it
  l.info "#{oldtiki} mysql version: " + dbhold.get_server_info
 
  dbhnew = Mysql.init
  dbhnew.options(Mysql::SET_CHARSET_DIR, "/root/tikiWikiScript/charsets/")
  dbhnew.options(Mysql::SET_CHARSET_NAME, "latin1")
  dbhnew.real_connect(newtiki,newuser,newpwd,newdbname,3307)
  l.info "#{newtiki} mysql version: " + dbhnew.get_server_info
 
  l.info "retrieving all pagenames from old tiki..."
 
  res= dbhold.query("select * from tiki_pages");
 
  num_pages_old_tiki = res.num_rows
  l.info "number of pages in old tiki : #{num_pages_old_tiki}"
  insertions = 0
  conflicts = 0
  history_updates = 0
  history_version_conflicts = 0
  while row = res.fetch_hash do
    #l.debug row
    pagename = row["pageName"]
    lastModif = row["lastModif"]
    user = row["user"]
    creator = row["creator"]
 
    l.debug "checking pageName='#{pagename}'"
    escapePageName = dbhnew.escape_string(pagename)
    l.debug "escapePageName='#{escapePageName}'"
    query = "select lastModif,user,creator from tiki_pages where pageName='#{escapePageName}'"
    res2 = dbhnew.query(query)
 
    if (res2.num_rows == 0) then
      pageid =  dbhnew.query("select max(page_id)+1 from tiki_pages").fetch_row[0]
 
      l.info "Creating page '#{pagename}' with pageid #{pageid}"        
 
      row["page_id"] = pageid;
      insertSt = createInsert(dbhnew,  "tiki_pages", row ) 
      lm.debug "#{insertSt}"   
 
      dbhnew.query "#{insertSt}"   
 
      addToNotifList( user, pagename);
      addToNotifList( creator, pagename) unless user == creator
      insertions += 1
    else
      if (res2.num_rows > 1) then
        l.error "database invariant violated: more than one entry for pagename #{pagename} found in the new tiki"
        fail
      end
      row2 = res2.fetch_hash
      lastModif2 = row2["lastModif"]
      l.debug "Comparing last modification of page #{pagename} in old tiki with same page in new tiki"
      if (lastModif > lastModif2) then                
        l.warn "pagename \"#{pagename}\" is newer in #{oldtiki} than in #{newtiki}"
        l.warn "we should send an email to #{user} and #{creator}"
 
        addToConflictNotifList( user, pagename)
        addToConflictNotifList( creator, pagename) unless user == creator
 
        conflicts += 1
      end
 
 
 
    end
    res2.free
    historyRes =  dbhold.query("select * from tiki_history where pageName='#{escapePageName}' ORDER BY version ASC")
    anyHistoryUpdate = false;
    while oldTikiHistoryEntry = historyRes.fetch_hash do
      m = oldTikiHistoryEntry["lastModif"]
      historyRes2 = dbhnew.query("select pageName from tiki_history where pageName='#{escapePageName}' and lastModif='#{m}'")
      if (historyRes2.num_rows == 0) then
      # this history entry was not present we must add new entry
        version = oldTikiHistoryEntry["version"]        
        lh.info "adding version #{version} last updated on #{oldTikiHistoryEntry["lastModif"]} to tiki_history for page '#{pagename}'"
        historyRes3 = dbhnew.query("select lastModif from tiki_history where pageName='#{escapePageName}' and version='#{version}'")
        if (historyRes3.num_rows == 0) then
          lh.debug "version: #{version} not present in #{newtiki}. Insert the entry"
        else
          lh.debug "version: #{version} of '#{pagename}' already exists in #{newtiki}. Now we have to insert the entry in the middle."
          dbhnew.query("update tiki_history set version=version+1 where pageName='#{escapePageName}' and version >= '#{version}' ORDER BY version DESC")
          history_version_conflicts += 1
        end
        insertSt = createInsert(dbhnew, "tiki_history",  oldTikiHistoryEntry ) 
        lm.info "#{insertSt}"
        dbhnew.query "#{insertSt}"
        history_updates += 1
        anyHistoryUpdate = true;
      else 
        lh.debug "The history entry with modification date #{m} of pageName=#{pagename} was already present in the tiki_history of #{newtiki}. Skipping it"
      end
 
 
    end
    l.info "History of page #{pagename} updated." if anyHistoryUpdate
    historyRes.free
  end
  l.info "number of pages in old tiki : #{num_pages_old_tiki}"
  l.info "number of insertions in the new wiki #{insertions}"
  l.info "number of conflicts in the new wiki #{conflicts}"
  l.info "number of history updates #{history_updates}"
  l.info "number of history version conflicts #{history_version_conflicts}"
 
  dateString = Time.now.to_s
  filename = "conflicts"+dateString+".txt"
  printToFile(filename, $conflictNotif)
  l.info "#{filename} created"
  filename = "notifications"+dateString+".txt"
  printToFile(filename, $notif)
  l.info "#{filename} created"
 
 
 
  res.free
 
  #Update tiki_links that is responsible for the backlinks feature
  link_insertions = 0
  l.debug "retrieve all links"
  linkRes = dbhold.query "select * from tiki_links"
  while linkRow = linkRes.fetch_hash do
    fromPage = dbhnew.escape_string(linkRow["fromPage"])
    toPage = dbhnew.escape_string(linkRow["toPage"])
    #check if this link already exists
    checkRes = dbhnew.query "select * from tiki_links where fromPage='#{fromPage}' and toPage='#{toPage}'"
    if (checkRes.num_rows == 0) then 
      insertSt = createInsert(dbhnew, "tiki_links", linkRow)
      lm.info "#{insertSt}"
      dbhnew.query insertSt
      link_insertions += 1      
    end    
    checkRes.free
  end
  linkRes.free
  l.info "number of link_insertions #{link_insertions}"
 
 
rescue Mysql::Error => e
  l.error "Error code: #{e.errno}"
  l.error "Error message: #{e.error}"
  l.error "Error SQLSTATE: #{e.sqlstate}" if e.respond_to?("sqlstate")
ensure
  # disconnect from server
  l.debug "disconnecting from database"
  dbhold.close if dbhold
  dbhnew.close if dbhnew
end

You’ll need to install RubyGems and the mysql gem first, plus the log4r gem that the script also requires.
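
The history merge is the trickiest part, so here is a sketch of the same idea on plain Ruby arrays instead of MySQL tables. The method name and the sample data are hypothetical; the rules are the ones the script applies: skip an entry whose modification time is already present, and when the version number is taken, shift that version and every later one up by one to make room.

```ruby
# Merge one history entry from the source wiki into the destination history
# of the same page. Entries are hashes with "version" and "lastModif".
def merge_history_entry(history, entry)
  # An entry with the same modification time is already there: skip it.
  return history if history.any? { |h| h["lastModif"] == entry["lastModif"] }

  # The version number is taken: renumber it and all later versions so the
  # new entry can be inserted in the middle.
  if history.any? { |h| h["version"] == entry["version"] }
    history = history.map do |h|
      h["version"] >= entry["version"] ? h.merge("version" => h["version"] + 1) : h
    end
  end

  (history + [entry]).sort_by { |h| h["version"] }
end

# Made-up destination history and one conflicting entry from the source.
history = [
  { "version" => 1, "lastModif" => 100 },
  { "version" => 2, "lastModif" => 300 }
]
merged = merge_history_entry(history, { "version" => 2, "lastModif" => 200 })
merged.each { |h| puts "version #{h['version']} modified at #{h['lastModif']}" }
```

The sketch returns a new array instead of updating rows in place, which keeps it easy to test; the script does the renumbering with a single UPDATE ordered by version in descending order.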

WordPress importing tools are awesome

Thursday, August 24th, 2006

Wow! I noticed the Import tab in the WordPress admin tool and found that it is possible to import your old blog from blogger.com (among others). It’s awesome and works perfectly! All my previous posts are now in WordPress.