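For work I needed to read DBF files. I looked at a number of open-source DBF readers, but they were either far too bulky (tens of thousands of lines is common) or had functional problems with encodings, field lengths, and so on. In short, I couldn't find one I was really happy with. Out of options and thoroughly fed up, I decided to write my own. It took fewer than 300 lines of code. Getting it to work was not the only goal, of course; it also had to be elegant and concise. Follow along and get a feel for a concise design and implementation.
Before writing any code, a word about DBF. It is an old format: it appeared back in the DOS era and was hugely popular for quite a while. It declined as large database systems took over, but because it is simple and easy to use, it is still widely used for data exchange. Over the years it also grew many versions with different structures, which means each version needs its own parsing logic.
Today I have only implemented parsing for Foxbase/DBaseIII, but the design is ready to be extended to other versions.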
[Figure: class diagram. http://wiki.jikexueyuan.com/project/open-source-framework-diy/images/2.1.jpg]
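Interface design
The diagram above has just two classes and one interface. Field and Header are two simple POJO classes that describe the fields and the file header, respectively. Reader is the interface for reading DBF files; it mainly defines methods for getting the file type, the encoding, and the fields, and for moving between records.
Code implementation
First, the abstract class that implements Reader: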
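import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public abstract class DbfReader implements Reader {
    // Character encoding used to decode field names and values.
    protected String encode = "GBK";
    private FileChannel fileChannel;
    protected Header header;
    protected List<Field> fields;
    // Whether the most recently read record carries the deletion flag.
    private boolean recordRemoved;
    // Current record position; 0 means "before the first record".
    int position = 0;
    // Registry: file type (first byte of the file) -> reader implementation.
    static Map<Integer, Class> readerMap = new HashMap<Integer, Class>();

    static {
        addReader(3, FoxproDBase3Reader.class);
    }

    public static void addReader(int type, Class clazz) {
        readerMap.put(type, clazz);
    }

    public static void addReader(int type, String className) throws ClassNotFoundException {
        readerMap.put(type, Class.forName(className));
    }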

    public byte getType() {
        return 3;
    }

    public String getEncode() {
        return encode;
    }

    public Header getHeader() {
        return header;
    }

    public List<Field> getFields() {
        return fields;
    }

    public boolean isRecordRemoved() {
        return recordRemoved;
    }

    public static Reader parse(String dbfFile, String encode) throws IOException, IllegalAccessException, InstantiationException {
        return parse(new File(dbfFile), encode);
    }

    public static Reader parse(String dbfFile) throws IOException, IllegalAccessException, InstantiationException {
        return parse(new File(dbfFile), "GBK");
    }

    public static Reader parse(File dbfFile) throws IOException, IllegalAccessException, InstantiationException {
        return parse(dbfFile, "GBK");
    }

    public static Reader parse(File dbfFile, String encode) throws IOException, IllegalAccessException, InstantiationException {
        RandomAccessFile aFile = new RandomAccessFile(dbfFile, "r");
        FileChannel fileChannel = aFile.getChannel();
        // The first byte of the file identifies the DBF version.
        ByteBuffer byteBuffer = ByteBuffer.allocate(1);
        fileChannel.read(byteBuffer);
        // Mask to 0..255 so that type bytes of 0x80 and above are not looked up as negative keys.
        int type = byteBuffer.array()[0] & 0xFF;
        Class<Reader> readerClass = readerMap.get(type);
        if (readerClass == null) {
            fileChannel.close();
            throw new IOException("Unsupported file type [" + type + "].");
        }
        DbfReader reader = (DbfReader) readerClass.newInstance();
        // Apply the requested encoding before any text is decoded.
        reader.encode = encode;
        reader.setFileChannel(fileChannel);
        reader.readHeader();
        reader.readFields();
        // Consume the terminator byte after the field descriptors so the channel sits on the first record.
        reader.skipHeaderTerminator();
        return reader;
    }
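
    public void setFileChannel(FileChannel fileChannel) {
        this.fileChannel = fileChannel;
    }

    protected abstract void readFields() throws IOException;

    public void moveBeforeFirst() throws IOException {
        position = 0;
        fileChannel.position(header.getHeaderLength());
    }

    /**
     * Moves to the given record so that the following next() call reads it.
     *
     * @param position record number, starting from 1
     * @throws java.io.IOException if the position is outside 1..record count
     */
    public void absolute(int position) throws IOException {
        if (position < 1 || position > header.getRecordCount()) {
            throw new IOException("Requested record " + position + " is outside the valid range 1.." + header.getRecordCount() + ".");
        }
        // Stored as "records consumed so far", the convention next() and hasNext() rely on.
        this.position = position - 1;
        // Compute the offset as a long so large files do not overflow int arithmetic.
        fileChannel.position(header.getHeaderLength() + (long) (position - 1) * header.getRecordLength());
    }

    // Guards next(): fails when every record has already been consumed.
    private void checkPosition(int position) throws IOException {
        if (position >= header.getRecordCount()) {
            throw new IOException("Requested record " + (position + 1) + " exceeds the actual record count: " + header.getRecordCount() + ".");
        }
    }

    protected abstract Field readField() throws IOException;

    protected abstract void readHeader() throws IOException;

    // Skips the single terminator byte that closes the header.
    private void skipHeaderTerminator() throws IOException {
        ByteBuffer byteBuffer = ByteBuffer.allocate(1);
        readByteBuffer(byteBuffer);
    }

    public void close() throws IOException {
        fileChannel.close();
    }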
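
    public void next() throws IOException {
        checkPosition(position);
        // Each record starts with a one-byte deletion flag; '*' marks a deleted record.
        ByteBuffer byteBuffer = ByteBuffer.allocate(1);
        readByteBuffer(byteBuffer);
        this.recordRemoved = (byteBuffer.array()[0] == '*');
        for (Field field : fields) {
            read(field);
        }
        position++;
    }

    public boolean hasNext() {
        return position < header.getRecordCount();
    }

    // Reads one fixed-width field value and stores both the trimmed text and the raw bytes.
    private void read(Field field) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(field.getLength());
        readByteBuffer(buffer);
        field.setStringValue(new String(buffer.array(), encode).trim());
        field.setBuffer(buffer);
    }

    protected void readByteBuffer(ByteBuffer byteBuffer) throws IOException {
        // A single read() may return fewer bytes than requested, so loop until the buffer is full.
        while (byteBuffer.hasRemaining()) {
            if (fileChannel.read(byteBuffer) < 0) {
                throw new IOException("Unexpected end of file.");
            }
        }
    }
}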
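The class above depends on the Reader interface and on the Field and Header POJOs, none of which are listed in this article. Here is a minimal sketch of what they might look like. Every member is inferred from how the code in this article calls them, so treat the exact names and types as assumptions; the types are package-private only so that the sketch fits in one file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.List;

// Sketch only: the methods are inferred from DbfReader's public methods.
interface Reader {
    byte getType();
    String getEncode();
    Header getHeader();
    List<Field> getFields();
    boolean isRecordRemoved();
    void moveBeforeFirst() throws IOException;
    void absolute(int position) throws IOException;
    void next() throws IOException;
    boolean hasNext();
    void close() throws IOException;
}

// Sketch only: file-level information taken from the DBF header.
class Header {
    private int lastUpdate;   // packed as yyyymmdd
    private int recordCount;
    private int headerLength; // in bytes, including the terminator byte
    private int recordLength; // in bytes, including the deletion flag

    public int getLastUpdate() { return lastUpdate; }
    public void setLastUpdate(int lastUpdate) { this.lastUpdate = lastUpdate; }
    public int getRecordCount() { return recordCount; }
    public void setRecordCount(int recordCount) { this.recordCount = recordCount; }
    public int getHeaderLength() { return headerLength; }
    public void setHeaderLength(int headerLength) { this.headerLength = headerLength; }
    public int getRecordLength() { return recordLength; }
    public void setRecordLength(int recordLength) { this.recordLength = recordLength; }
}

// Sketch only: one field descriptor plus the value of that field in the current record.
class Field {
    private String name;
    private char type;
    private int displacement;
    private int length;
    private int decimal;
    private byte flag;
    private String stringValue;
    private ByteBuffer buffer;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public char getType() { return type; }
    public void setType(char type) { this.type = type; }
    public int getDisplacement() { return displacement; }
    public void setDisplacement(int displacement) { this.displacement = displacement; }
    public int getLength() { return length; }
    public void setLength(int length) { this.length = length; }
    public int getDecimal() { return decimal; }
    public void setDecimal(int decimal) { this.decimal = decimal; }
    public byte getFlag() { return flag; }
    public void setFlag(byte flag) { this.flag = flag; }
    public String getStringValue() { return stringValue; }
    public void setStringValue(String stringValue) { this.stringValue = stringValue; }
    public ByteBuffer getBuffer() { return buffer; }
    public void setBuffer(ByteBuffer buffer) { this.buffer = buffer; }
}
```

Note that a Field here holds both the column definition and the current record's value, which is why next() simply refreshes every Field in place.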
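DbfReader is the largest class of the lot. The static methods addReader and parse deserve attention: addReader registers a Reader for a new file type, and parse parses a file.
parse first reads the first byte of the file and checks whether a parser implementation is registered for that value. If there is one, that implementation parses the file; if not, parse throws an error saying the type is not supported.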
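One detail to keep in mind when looking up a reader by the first byte: Java bytes are signed, so a type byte of 0x80 or above turns negative when it is simply cast to int. The sketch below shows the effect with a plain map. The class name, the helper registryKey, and the 0x83 entry are made up for illustration and are not part of the code in this article.

```java
import java.util.HashMap;
import java.util.Map;

public class VersionByteDemo {
    // Hypothetical helper: turn the raw first byte into a non-negative registry key.
    static int registryKey(byte firstByte) {
        return firstByte & 0xFF;
    }

    public static void main(String[] args) {
        Map<Integer, String> readers = new HashMap<>();
        readers.put(0x03, "Foxbase/DBaseIII reader");
        readers.put(0x83, "hypothetical reader for type 0x83");

        byte raw = (byte) 0x83;
        // A plain cast sign-extends the byte to -125, so the lookup misses.
        System.out.println(readers.get((int) raw));        // prints "null"
        // Masking keeps the value in 0..255, so the lookup succeeds.
        System.out.println(readers.get(registryKey(raw))); // prints "hypothetical reader for type 0x83"
    }
}
```

For type 3 both forms give the same key, which is why the difference only shows up once readers for other versions are registered.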
With that in place, writing an implementation class is easy. Here is the parser for FoxproDBase3:
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;

public class FoxproDBase3Reader extends DbfReader {
    protected void readFields() throws IOException {
        fields = new ArrayList<Field>();
        // Header layout: 32 bytes, then one 32-byte descriptor per field, then a 1-byte terminator.
        int fieldCount = (header.getHeaderLength() - 32 - 1) / 32;
        for (int i = 0; i < fieldCount; i++) {
            fields.add(readField());
        }
    }

    public byte getType() {
        return 3;
    }

    protected Field readField() throws IOException {
        Field field = new Field();
        ByteBuffer byteBuffer = ByteBuffer.allocate(32);
        readByteBuffer(byteBuffer);
        byte[] bytes = byteBuffer.array();
        // The name occupies the first 11 bytes and is padded with zero bytes.
        field.setName(new String(bytes, 0, 11, encode).trim().split("\0")[0]);
        field.setType((char) bytes[11]);
        field.setDisplacement(Util.getUnsignedInt(bytes, 12, 4));
        field.setLength(Util.getUnsignedInt(bytes, 16, 1));
        field.setDecimal(Util.getUnsignedInt(bytes, 17, 1));
        field.setFlag(bytes[18]);
        return field;
    }

    protected void readHeader() throws IOException {
        header = new Header();
        // parse() has already consumed the type byte, so 31 of the 32 header bytes remain.
        ByteBuffer byteBuffer = ByteBuffer.allocate(31);
        readByteBuffer(byteBuffer);
        byte[] bytes = byteBuffer.array();
        // The last-update date is stored as YY MM DD with the year counted from 1900; pack it as yyyymmdd.
        header.setLastUpdate((Util.getUnsignedInt(bytes, 0, 1) + 1900) * 10000 + Util.getUnsignedInt(bytes, 1, 1) * 100 + Util.getUnsignedInt(bytes, 2, 1));
        header.setRecordCount(Util.getUnsignedInt(bytes, 3, 4));
        header.setHeaderLength(Util.getUnsignedInt(bytes, 7, 2));
        header.setRecordLength(Util.getUnsignedInt(bytes, 9, 2));
    }
}
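Util.getUnsignedInt is used above but not listed in this article. Judging from the calls, it takes a byte array, an offset, and a byte count, and DBF stores multi-byte numbers in little-endian order. Under those assumptions, a sketch of such a helper, together with the same header arithmetic as readHeader applied to hand-made bytes, could look like this. The class name HeaderDecodeDemo and the sample values are invented for the example.

```java
public class HeaderDecodeDemo {
    // Assumed behaviour of Util.getUnsignedInt: read `length` bytes at `offset`
    // as an unsigned little-endian number. A 4-byte value with the top bit set
    // would not fit in an int; this sketch ignores that case.
    static int getUnsignedInt(byte[] bytes, int offset, int length) {
        int value = 0;
        for (int i = length - 1; i >= 0; i--) {
            value = (value << 8) | (bytes[offset + i] & 0xFF);
        }
        return value;
    }

    public static void main(String[] args) {
        // The 31 header bytes that follow the type byte, as readHeader() sees them.
        byte[] bytes = new byte[31];
        bytes[0] = 114;  // year, counted from 1900 -> 2014
        bytes[1] = 4;    // month
        bytes[2] = 1;    // day
        bytes[3] = 0x10; // record count 10000 = 0x2710, low byte first
        bytes[4] = 0x27;
        bytes[7] = 97;   // header length: 32 + 2 * 32 + 1
        bytes[9] = 21;   // record length: 1 flag byte + 20 data bytes

        int lastUpdate = (getUnsignedInt(bytes, 0, 1) + 1900) * 10000
                + getUnsignedInt(bytes, 1, 1) * 100
                + getUnsignedInt(bytes, 2, 1);
        int recordCount = getUnsignedInt(bytes, 3, 4);
        int headerLength = getUnsignedInt(bytes, 7, 2);
        int recordLength = getUnsignedInt(bytes, 9, 2);
        int fieldCount = (headerLength - 32 - 1) / 32;

        System.out.println(lastUpdate);   // prints 20140401
        System.out.println(recordCount);  // prints 10000
        System.out.println(headerLength); // prints 97
        System.out.println(recordLength); // prints 21
        System.out.println(fieldCount);   // prints 2
    }
}
```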
Test case
import java.io.IOException;

public class DbfReaderTest {
    static String[] files = {"BESTIMATE20140401", "BHDQUOTE20140401"};

    public static void main(String[] args) throws IOException, IllegalAccessException, InstantiationException {
        for (String file : files) {
            printFile(file);
        }
    }

    public static void printFile(String fileName) throws IOException, InstantiationException, IllegalAccessException {
        Reader dbfReader = DbfReader.parse("E:\\20140401\\" + fileName + ".DBF");
        // Print the field definitions first.
        for (Field field : dbfReader.getFields()) {
            System.out.printf("name:%s %s(%d,%d)\n", field.getName(), field.getType(), field.getLength(), field.getDecimal());
        }
        System.out.println();
        // Then print every record, one line per record, padded to each field's width.
        for (int i = 0; i < dbfReader.getHeader().getRecordCount(); i++) {
            dbfReader.next();
            for (Field field : dbfReader.getFields()) {
                System.out.printf("%" + field.getLength() + "s", field.getStringValue());
            }
            System.out.println();
        }
        dbfReader.close();
    }
}
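As you can see, the final usage is also very concise.
Code statistics
[Figure: code line statistics. http://wiki.jikexueyuan.com/project/open-source-framework-diy/images/2.2.jpg]
The total comes to 282 lines of code. Leaving out imports, interface declarations, and the like, the code that does the real work is only about 200 lines.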
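Summary
The above shows not only how to parse DBF files, but also how to design for a reasonable balance between today's requirements and future extension.
For example, to support another DBF standard, just implement a class along the lines of FoxproDBase3Reader above and then register it with DbfReader.addReader(type, XxxReader.class), where type is the first-byte value of that format.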
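To make the extension step concrete without needing a real DBF file, here is a self-contained sketch of the same registry idea using stand-in classes. BaseReader, DBase3Reader, OtherVersionReader, and the type value 0x30 are all invented for the example; only the pattern (a static map from type to class, filled by addReader and consulted at parse time) mirrors the code in this article.

```java
import java.util.HashMap;
import java.util.Map;

public class ReaderRegistryDemo {
    // Stand-in for the DbfReader hierarchy, reduced to what the registry needs.
    static abstract class BaseReader {
        abstract String describe();
    }

    static class DBase3Reader extends BaseReader {
        String describe() { return "Foxbase/DBaseIII"; }
    }

    // Hypothetical reader for some other DBF version.
    static class OtherVersionReader extends BaseReader {
        String describe() { return "hypothetical other version"; }
    }

    private static final Map<Integer, Class<? extends BaseReader>> READERS = new HashMap<>();

    static {
        addReader(0x03, DBase3Reader.class);
    }

    static void addReader(int type, Class<? extends BaseReader> clazz) {
        READERS.put(type, clazz);
    }

    // Looks up the class registered for the type and instantiates it reflectively.
    static BaseReader forType(int type) throws ReflectiveOperationException {
        Class<? extends BaseReader> clazz = READERS.get(type);
        if (clazz == null) {
            throw new IllegalArgumentException("Unsupported file type [" + type + "].");
        }
        return clazz.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // Supporting a new version is a single registration; the lookup code stays untouched.
        addReader(0x30, OtherVersionReader.class);
        System.out.println(forType(0x03).describe()); // prints "Foxbase/DBaseIII"
        System.out.println(forType(0x30).describe()); // prints "hypothetical other version"
    }
}
```

The sketch uses getDeclaredConstructor().newInstance() rather than Class.newInstance(), since the latter has been deprecated since Java 9; either way the implementation class needs a no-argument constructor.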
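Good design has to avoid over-engineering and needless complexity while still giving appropriate thought to future change and extension, so that a new requirement does not mean tweaking things here and there and ending up reshaping the structure. It should also follow the DRY principle: if a program contains large amounts of duplication, there is certainly a problem in its structural design.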