
Re: encoding issue with ruby binding

From: Andreas Mohr <andi_at_lisas.de>
Date: Thu, 26 Sep 2013 21:35:36 +0200

Hi,

On Wed, Sep 25, 2013 at 11:20:58AM +0200, Stephane D'Alu wrote:
> Version:
> Subversion: 1.8.3
> Ruby: 2.0.0.195
>
> Error message:
> /usr/local/lib/ruby/site_ruby/2.0/svn/info.rb:236:in `===': invalid byte
> sequence in US-ASCII (ArgumentError)
>
> It occurs in the "parse_diff_unified" method when trying to match lines
> of "entry.body"
>
>
> How to repeat:
> Having a UTF-8 encoded character in a committed file
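
A minimal sketch of the failure described above, not taken from svn/info.rb itself. It assumes the diff line reaches the matcher tagged as US-ASCII while actually holding UTF-8 bytes; the sample string and variable names are made up for illustration.

```ruby
# Hypothetical diff line: "é" as UTF-8 bytes (C3 A9), but with the string
# tagged US-ASCII, as can happen when the default external encoding is ASCII.
line = "+caf\xC3\xA9".dup.force_encoding('US-ASCII')

# Bytes above 0x7F are not valid US-ASCII, so the string counts as broken.
puts line.valid_encoding?

matching_error = nil
begin
  # case/when with a regex calls Regexp#===, the method named in the backtrace.
  case line
  when /^\+/ then puts 'added line'
  end
rescue ArgumentError => e
  matching_error = e
  puts "#{e.class}: #{e.message}"
end
```

The pattern itself is pure ASCII; the exception comes from the string's broken encoding, not from the regex.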

This may not be a complete solution, or even particularly helpful,
but for reference here's a code fragment that I wrote
to handle such issues in vcproj2cmake
(there it deals with filenames rather than file content,
but that does not matter):

# RUBY VERSION COMPAT STUFF

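if (RUBY_VERSION < '1.9') # Conservative check: String#start_with? exists since Ruby 1.8.7.
  def rc_string_start_with(candidate, str_start)
    # Escape str_start so it is matched literally; \A anchors at the string start.
    nil != candidate.match(/\A#{Regexp.escape(str_start)}/)
  end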
else
  def rc_string_start_with(candidate, str_start)
    candidate.start_with?(str_start) # SYNTAX_CHECK_WHITELIST
  end
end

module V2C_Ruby_Compat
  alias string_start_with rc_string_start_with
  module_function :string_start_with
end

# Guards against exceptions due to encountering mismatching-encoding entries
# within the directory.
def dir_entries_grep_skip_broken(dir_entries, regex)
  dir_entries.grep(regex)
rescue ArgumentError => e
  if not V2C_Ruby_Compat::string_start_with(e.message, 'invalid byte sequence')
    raise
  end
  # Hrmpf, *some* entry failed. Rescue operations,
  # by going through each entry manually and logging/skipping broken ones.
  # (array_collect_compact and log_error are helpers not shown in this fragment.)
  array_collect_compact(dir_entries) do |entry|
    result = nil
    begin
      if not regex.match(entry).nil?
        result = entry
      end
    rescue ArgumentError => e
      if V2C_Ruby_Compat::string_start_with(e.message, 'invalid byte sequence')
        log_error "Dir entry #{entry} has invalid (foreign?) encoding (#{e.message}), skipping!"
        result = nil
      else
        raise
      end
    end
    result
  end
end
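
For file content, as in the original report, the same idea can be applied per line by fixing up the encoding before matching. The following is only a minimal sketch, assuming the content is really UTF-8 and was merely tagged with the wrong encoding; the helper names `normalize_utf8` and `safe_match?` are hypothetical and belong neither to the Subversion bindings nor to vcproj2cmake. It avoids `String#scrub`, which only exists from Ruby 2.1 on.

```ruby
# Hypothetical helper: returns a UTF-8 copy of str that is safe to match against.
def normalize_utf8(str)
  utf8 = str.dup.force_encoding('UTF-8')
  return utf8 if utf8.valid_encoding?
  # Not valid UTF-8 after all: treat the bytes as binary and replace
  # every non-ASCII byte with '?'.
  str.dup.force_encoding('BINARY').encode('UTF-8', invalid: :replace, undef: :replace, replace: '?')
end

# Hypothetical helper: regex match that does not raise on mis-tagged input.
def safe_match?(regex, str)
  !regex.match(normalize_utf8(str)).nil?
end

# UTF-8 bytes tagged as US-ASCII (the situation from the report).
mistagged = "+caf\xC3\xA9".dup.force_encoding('US-ASCII')
puts safe_match?(/^\+caf/, mistagged)

# A Latin-1 byte that is not valid UTF-8 either; it gets replaced.
foreign = "+caf\xE9".dup.force_encoding('US-ASCII')
puts normalize_utf8(foreign)
```

Unlike the rescue-and-skip approach above, this keeps every line, at the cost of guessing the encoding.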

> Stephane D'Alu -- Ingenieur Recherche
> Laboratoire CITI / INSA-Lyon

Lyon is nice for vacations :-)

Andreas Mohr
Received on 2013-09-26 21:36:12 CEST

This is an archived mail posted to the Subversion Users mailing list.
